Abstract

Bayesian inference for fractionally integrated exponential generalized autoregressive
conditional heteroskedastic (FIEGARCH) models using Markov Chain Monte Carlo (MCMC)
methods is described. A simulation study is presented to assess the performance of the
procedure in the presence of long memory in the volatility. Samples from FIEGARCH
processes are obtained upon considering the generalized error distribution (GED) for the
innovation process. Different values for the tail-thickness parameter $\nu$ are considered,
covering both scenarios: innovation processes with heavier ($\nu < 2$) and lighter ($\nu > 2$)
tails than the Gaussian distribution ($\nu = 2$). A sensitivity analysis is performed by
considering different prior density functions and by integrating (or not) the knowledge of
the true parameter values to select the hyperparameter values.

1 Introduction

ARCH-type (Autoregressive Conditional Heteroskedasticity) and stochastic volatility (Breidt et
al., 1998) models are commonly used in financial time series modeling to represent the dynamic
evolution of volatilities. By ARCH-type models we mean not only the ARCH model proposed
by Engle (1982) but also the several generalizations that were proposed later.

Among the most popular generalizations of the ARCH model is the generalized ARCH 
(GARCH) model, introduced by Bollerslev (1986), for which the conditional variance depends 
not only on the p past values of the process (as in the ARCH model), but also on the q past 
values of the conditional variance. Although the ARCH and GARCH models are widely used 
in practice, they do not take into account the asymmetry in the volatility, that is, the fact 
that volatility tends to rise in response to "bad" news and to fall in response to "good" news. 
As an alternative, Nelson (1991) introduced the exponential GARCH (EGARCH) model. This
model not only describes the asymmetry in the volatility, but also has the advantage that
the positivity of the conditional variance is always guaranteed, since the variance is defined
through its logarithm.

MCMC Bayesian Estimation in FIEGARCH Models

The fractionally integrated EGARCH (FIEGARCH) and fractionally integrated GARCH 
(FIGARCH) models proposed, respectively, by Bollerslev and Mikkelsen (1996) and Baillie et 
al. (1996), generalize the EGARCH (Nelson, 1991) and the GARCH (Bollerslev, 1986) models, 
respectively. FIEGARCH models have not only the capability of modeling clusters of volatility 
(as ARCH and GARCH models do) and capturing its asymmetry (as the EGARCH model 
does) but they also take into account the characteristic of long memory in the volatility (as 
the FIGARCH model does). The non-stationarity of FIGARCH models (in the weak sense)
makes this class of models less attractive for practical applications. Another drawback of the
FIGARCH models is that we must have $d \ge 0$ and the polynomial coefficients in their
definition must satisfy some restrictions so that the conditional variance is positive.
FIEGARCH models do not have this problem, since the variance is defined in terms of the
logarithm function. Moreover, they are weakly stationary whenever the long memory
parameter $d$ is smaller than 0.5 (Lopes and Prass, 2013).

A complete study on the theoretical properties of FIEGARCH processes is presented in Lopes 
and Prass (2013). The authors also conduct a simulation study to analyze the finite sample 
performance of the quasi-maximum likelihood (QML) procedure on parameter estimation. The 
QML procedure became popular for two main reasons. First, the expression for the quasi- 
likelihood function is simpler for the Gaussian case than when considering, for example, the 
Student's t or the generalized error distribution (GED). Second, since the parameters of the 
distribution function are not estimated, the dimension of the optimization problem is reduced. 
On the other hand, the results in Lopes and Prass (2013) indicate that, although the QML
procedure performs relatively well when the sample size is 2000, the estimation improves
only very slowly as the sample size increases.

In this work we propose the use of Bayesian methods based on Monte Carlo simulation
techniques for the estimation of the FIEGARCH model parameters. This procedure is usually
considered to analyze financial time series assuming stochastic volatility models (see, for
example, Meyer and Yu, 2000), mostly because of the difficulty in applying traditional
statistical techniques due to the complexity of the likelihood function. To generate samples
from the joint posterior distribution of the parameters of interest we use MCMC (Markov Chain
Monte Carlo) methods such as the Gibbs sampling algorithm (see, for example, Gelfand and
Smith, 1990; Casella and George, 1992) or the Metropolis-Hastings algorithm (see, for example,
Smith and Roberts, 1993; Chib and Greenberg, 1995). These samples are generated from the
conditional posterior distribution of each parameter given all the other parameters and the data.

A simulation study is conducted to assess the finite sample performance of the procedure
proposed here, in the presence of long memory in the volatility. The samples from FIEGARCH
processes are obtained upon considering the GED for the innovation process. Taking into
account that financial time series are usually characterized by heavy-tailed distributions,
different values for the tail-thickness parameter $\nu$ are considered, covering both scenarios:
innovation processes with lighter and heavier tails than the Gaussian distribution. A
sensitivity analysis is performed by considering different prior density functions and by
integrating (or not) the knowledge of the true parameter values to select the hyperparameter
values.

The paper is organized as follows. In Section 2 a review on the definition and main properties 
of FIEGARCH processes is presented. Section 3 describes the parameter estimation procedure 
when Bayesian inference using MCMC is considered. Section 4 describes the steps used in the 
simulation study, such as the data generating process, the prior selection procedure and the 
performance measures considered. This section also reports the simulation results. Section 5 
concludes the paper. 
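2 FIEGARCH Processes

Let $(1 - \mathcal{B})^{-d}$ be the operator defined by its Maclaurin series expansion, namely,

$$(1 - \mathcal{B})^{-d} = \sum_{k=0}^{\infty} \frac{\Gamma(k+d)}{\Gamma(k+1)\Gamma(d)}\, \mathcal{B}^k := \sum_{k=0}^{\infty} \pi_{d,k}\, \mathcal{B}^k, \tag{1}$$

where $\pi_{d,k} := \frac{\Gamma(k+d)}{\Gamma(k+1)\Gamma(d)}$, for all $k \ge 0$, $\Gamma(\cdot)$ is the gamma function and
$\mathcal{B}$ is the backward shift operator defined by $\mathcal{B}^k(X_t) = X_{t-k}$, for all $k \in \mathbb{N}$.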
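
As an illustration, the coefficients $\pi_{d,k}$ in (1) can be computed without evaluating gamma functions of large arguments, since $\Gamma(x+1) = x\Gamma(x)$ gives $\pi_{d,0} = 1$ and $\pi_{d,k} = \pi_{d,k-1}(k-1+d)/k$. The sketch below is not part of the original study; the function names are hypothetical.

```python
import math

def pi_coefficients(d, n_terms):
    """Return [pi_{d,0}, ..., pi_{d,n_terms-1}], the coefficients of (1 - B)^{-d}."""
    coeffs = [1.0]
    for k in range(1, n_terms):
        # Gamma(k+d)/Gamma(k+1) = [Gamma(k-1+d)/Gamma(k)] * (k-1+d)/k
        coeffs.append(coeffs[-1] * (k - 1 + d) / k)
    return coeffs

def pi_closed_form(d, k):
    """Direct evaluation of Gamma(k+d) / (Gamma(k+1) Gamma(d)); requires d != 0."""
    return math.gamma(k + d) / (math.gamma(k + 1) * math.gamma(d))

coeffs = pi_coefficients(0.25, 6)
```

The recursion and the closed form agree for small $k$; the recursion is the safer choice for long truncation lags, where the gamma function overflows.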

Assume that $\alpha(\cdot)$ and $\beta(\cdot)$ are polynomials of order $p$ and $q$, respectively, defined by

$$\alpha(z) = \sum_{i=0}^{p} (-\alpha_i) z^i \quad \text{and} \quad \beta(z) = \sum_{j=0}^{q} (-\beta_j) z^j, \tag{2}$$

with $\alpha_0 = \beta_0 = -1$. If $\alpha(\cdot)$ and $\beta(\cdot)$ have no common roots and $\beta(z) \neq 0$ in the closed disk
$\{z : |z| \le 1\}$, then the function $\lambda(\cdot)$, defined by

$$\lambda(z) = \frac{\alpha(z)}{\beta(z)} (1 - z)^{-d} := \sum_{k=0}^{\infty} \lambda_{d,k} z^k, \quad \text{for all } |z| < 1, \tag{3}$$

is analytic in the open disk $\{z : |z| < 1\}$, for any $d > 0$, and in the closed disk $\{z : |z| \le 1\}$,
whenever $d \le 0$. Therefore, $\lambda(\cdot)$ is well defined and the power series representation in (3) is
unique. More specifically, the coefficients $\lambda_{d,k}$, for all $k \in \mathbb{N}$, are given by (see Lopes and Prass,
2013)

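$$\lambda_{d,0} = 1 \quad \text{and} \quad \lambda_{d,k} = -\alpha_k^* + \sum_{i=0}^{k-1} \lambda_{d,i} \sum_{j=0}^{k-i} \beta_j^*\, \delta_{d,k-i-j}, \quad \text{for all } k \ge 1, \tag{4}$$

where

$$\alpha_m^* := \begin{cases} \alpha_m, & \text{if } 0 \le m \le p; \\ 0, & \text{if } m > p; \end{cases} \qquad \beta_m^* := \begin{cases} \beta_m, & \text{if } 0 \le m \le q; \\ 0, & \text{if } m > q; \end{cases} \tag{5}$$

and $\delta_{d,j} := \pi_{-d,j}$, for all $j \in \mathbb{N}$, are the coefficients obtained upon replacing $-d$ by $d$ in (1), that
is,

$$\sum_{k=0}^{\infty} \delta_{d,k}\, \mathcal{B}^k := \sum_{j=0}^{\infty} \pi_{-d,j}\, \mathcal{B}^j = (1 - \mathcal{B})^{d}.$$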
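
The recursion in (4) can be sketched as follows. This is only an illustrative implementation under the sign convention of (2), that is, $\alpha(z) = 1 - \alpha_1 z - \cdots - \alpha_p z^p$ and similarly for $\beta(z)$; the function names and the numerical values in the example are hypothetical.

```python
def pi_coefficients(d, n_terms):
    """Coefficients pi_{d,k} of (1 - B)^{-d}, via pi_k = pi_{k-1} (k-1+d)/k."""
    coeffs = [1.0]
    for k in range(1, n_terms):
        coeffs.append(coeffs[-1] * (k - 1 + d) / k)
    return coeffs

def padded(seq, m):
    """Return seq[m], or 0 beyond the polynomial order (the starred values in (5))."""
    return seq[m] if m < len(seq) else 0.0

def lambda_coefficients(d, alpha, beta, n_terms):
    """lambda_{d,k} from recursion (4); alpha = [alpha_1..alpha_p], beta = [beta_1..beta_q]."""
    a_star = [-1.0] + list(alpha)           # alpha_0 = -1
    b_star = [-1.0] + list(beta)            # beta_0 = -1
    delta = pi_coefficients(-d, n_terms)    # delta_{d,j} = pi_{-d,j}
    lam = [1.0]
    for k in range(1, n_terms):
        total = -padded(a_star, k)
        for i in range(k):
            inner = sum(padded(b_star, j) * delta[k - i - j] for j in range(k - i + 1))
            total += lam[i] * inner
        lam.append(total)
    return lam

# FIEGARCH(0, d, 0): lambda_{d,k} reduces to pi_{d,k}.
lam_0d0 = lambda_coefficients(0.25, [], [], 8)
# d = 0 with p = q = 1: lambda(z) = (1 - 0.3 z) / (1 - 0.5 z).
lam_arma = lambda_coefficients(0.0, [0.3], [0.5], 4)
```

Two special cases give a quick sanity check: with $p = q = 0$ the coefficients coincide with $\pi_{d,k}$, and with $d = 0$ they are the power series coefficients of $\alpha(z)/\beta(z)$.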

Let $\theta, \gamma \in \mathbb{R}$ and $\{Z_t\}_{t \in \mathbb{Z}}$ be a sequence of independent and identically distributed (i.i.d.)
random variables, with zero mean and variance equal to one. Assume that $\theta$ and $\gamma$ are not both
equal to zero and define $\{g(Z_t)\}_{t \in \mathbb{Z}}$ by

$$g(Z_t) = \theta Z_t + \gamma \big[ |Z_t| - E(|Z_t|) \big], \quad \text{for all } t \in \mathbb{Z}. \tag{6}$$

It follows that (see Lopes and Prass, 2013) $\{g(Z_t)\}_{t \in \mathbb{Z}}$ is a strictly stationary and ergodic
process. Moreover, since $E(Z_0^2) < \infty$, then $\{g(Z_t)\}_{t \in \mathbb{Z}}$ is also weakly stationary with mean zero
(therefore a white noise process) and variance $\sigma_g^2$ given by

$$\sigma_g^2 = \theta^2 + \gamma^2 - [\gamma E(|Z_0|)]^2 + 2\theta\gamma E(Z_0 |Z_0|). \tag{7}$$
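
For a concrete check of (6) and (7), the following sketch evaluates $g(\cdot)$ and $\sigma_g^2$ for standard Gaussian innovations, for which $E(|Z_0|) = \sqrt{2/\pi}$ and $E(Z_0|Z_0|) = 0$. The function names are hypothetical, and the values of $\theta$ and $\gamma$ are the ones used later in the simulation study.

```python
import math

def g(z, theta, gamma, mean_abs_z):
    """g(z) = theta * z + gamma * (|z| - E|Z|), as in (6)."""
    return theta * z + gamma * (abs(z) - mean_abs_z)

def var_g(theta, gamma, mean_abs_z, mean_z_abs_z):
    """Variance of g(Z_t) from (7), given E|Z_0| and E(Z_0 |Z_0|)."""
    return (theta ** 2 + gamma ** 2
            - (gamma * mean_abs_z) ** 2
            + 2.0 * theta * gamma * mean_z_abs_z)

theta, gamma = -0.15, 0.24
mean_abs_gauss = math.sqrt(2.0 / math.pi)   # E|Z| for Z ~ N(0, 1)
sigma2_g = var_g(theta, gamma, mean_abs_gauss, 0.0)
```

For any symmetric innovation distribution the last term in (7) vanishes, so $\theta$ enters the variance only through $\theta^2$.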
Now, for any $d < 0.5$ and $\omega \in \mathbb{R}$, let $\{X_t\}_{t \in \mathbb{Z}}$ be the stochastic process defined by

$$X_t = \sigma_t Z_t, \tag{8}$$

$$\ln(\sigma_t^2) = \omega + \sum_{k=0}^{\infty} \lambda_{d,k}\, g(Z_{t-1-k}), \quad \text{for all } t \in \mathbb{Z}. \tag{9}$$

Then $\{X_t\}_{t \in \mathbb{Z}}$ is a Fractionally Integrated EGARCH process, denoted by FIEGARCH$(p, d, q)$
(Bollerslev and Mikkelsen, 1996).

The properties of FIEGARCH$(p, d, q)$ processes, with $d < 0.5$, are given below (the proofs
of these properties can be found in Lopes and Prass, 2013). Henceforth GED$(\nu)$ denotes the
generalized error distribution with tail-thickness parameter $\nu$.

Proposition 1. Let $\{X_t\}_{t \in \mathbb{Z}}$ be a FIEGARCH$(p, d, q)$ process. Then the following properties hold:

1. $\{\ln(\sigma_t^2)\}_{t \in \mathbb{Z}}$ is a stationary (weakly and strictly) and ergodic process and the random
variable $\ln(\sigma_t^2)$ is almost surely finite, for all $t \in \mathbb{Z}$;

2. if $d \in (-1, 0.5)$ and $\alpha(z) \neq 0$, for $|z| \le 1$, the process $\{\ln(\sigma_t^2)\}_{t \in \mathbb{Z}}$ is invertible;

3. $\{X_t\}_{t \in \mathbb{Z}}$ and $\{\sigma_t^2\}_{t \in \mathbb{Z}}$ are strictly stationary and ergodic processes;

4. if $\{Z_t\}_{t \in \mathbb{Z}}$ is a sequence of i.i.d. GED$(\nu)$ random variables, with $\nu > 1$, zero mean and
variance equal to one, then $E(X_t^r) < \infty$ and $E(\sigma_t^{2r}) < \infty$, for all $t \in \mathbb{Z}$ and $r > 0$.

3 Parameter Estimation: Bayesian Inference using MCMC 

Let $\nu$ be the parameter (or vector of parameters) associated to the probability density function
of $Z_0$ and denote by

• $\eta = (\nu, d, \theta, \gamma, \omega, \alpha_1, \dots, \alpha_p, \beta_1, \dots, \beta_q)' := (\eta_1, \eta_2, \dots, \eta_{5+p+q})'$ the vector of unknown
parameters in (9);

• $\eta_{(-i)}$ the vector containing all parameters in $\eta$ except $\eta_i$, for each $i \in \{1, \dots, 5+p+q\}$;

• $p_Z(\cdot \mid \nu)$ the probability density function of $Z_0$ given $\nu$;

• $\mathcal{F}_t$ the $\sigma$-algebra generated by $\{Z_s\}_{s \le t}$;

• $p_{X_t}(\cdot \mid \eta, \mathcal{F}_{t-1})$ the probability density function of $X_t$ given $\eta$ and $\mathcal{F}_{t-1}$, for all $t \in \mathbb{Z}$.

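From (9) it is evident that, given $\eta$, $\sigma_t$ is an $\mathcal{F}_{t-1}$-measurable random variable. Moreover,
since $X_t = \sigma_t Z_t$ and $p_Z(\cdot \mid \nu, \mathcal{F}_{t-1}) = p_Z(\cdot \mid \nu)$, the following equality holds

$$p_{X_t}(x_t \mid \eta, \mathcal{F}_{t-1}) = \frac{1}{\sigma_t}\, p_Z(x_t \sigma_t^{-1} \mid \nu), \quad \text{with } \sigma_t = \exp\Big\{ \frac{1}{2} \Big[ \omega + \sum_{k=0}^{\infty} \lambda_{d,k}\, g(z_{t-1-k}) \Big] \Big\}, \tag{10}$$

for all $x_t \in \mathbb{R}$ and $t \in \mathbb{Z}$. Furthermore, from (10), the conditional probability density function
of $X := (X_1, \dots, X_n)'$ given $\eta$ and $\mathcal{F}_0$ can be written as

$$p_X(x_1, \dots, x_n \mid \eta, \mathcal{F}_0) = p_{X_n}(x_n \mid \eta, x_{n-1}, \dots, x_1, \mathcal{F}_0) \times \cdots \times p_{X_1}(x_1 \mid \eta, \mathcal{F}_0). \tag{11}$$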
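
To make (10) and (11) concrete, the sketch below evaluates the conditional log-likelihood of a FIEGARCH$(0, d, 0)$ model. It relies on two ingredients taken from Section 4: the GED$(\nu)$ density for $p_Z$ and the simplification $g(Z_s) = 0$, for all $s < 1$. The function names are hypothetical and the code is only one possible arrangement of the computation.

```python
import math

def ged_scale(nu):
    """Scale constant that gives the GED(nu) density unit variance."""
    return math.sqrt(2.0 ** (-2.0 / nu) * math.gamma(1.0 / nu) / math.gamma(3.0 / nu))

def ged_logpdf(z, nu):
    scale = ged_scale(nu)
    log_norm = (math.log(nu) - math.log(scale)
                - (1.0 + 1.0 / nu) * math.log(2.0) - math.lgamma(1.0 / nu))
    return log_norm - 0.5 * abs(z / scale) ** nu

def ged_mean_abs(nu):
    """E|Z| for Z ~ GED(nu) with unit variance."""
    return ged_scale(nu) * 2.0 ** (1.0 / nu) * math.gamma(2.0 / nu) / math.gamma(1.0 / nu)

def fiegarch_loglik(x, nu, d, theta, gamma, omega):
    """log p_X(x_1..x_n | eta, I_0) for FIEGARCH(0,d,0), assuming g(z_s) = 0 for s < 1."""
    n = len(x)
    coef = [1.0]                      # lambda_{d,k} = pi_{d,k} when p = q = 0
    for k in range(1, n):
        coef.append(coef[-1] * (k - 1 + d) / k)
    mean_abs = ged_mean_abs(nu)
    g_hist = []                       # g(z_1), ..., g(z_{t-1})
    loglik = 0.0
    for x_t in x:
        # ln(sigma_t^2) = omega + sum_k lambda_{d,k} g(z_{t-1-k}), most recent term first
        log_sigma2 = omega + sum(coef[k] * g_hist[-1 - k] for k in range(len(g_hist)))
        sigma = math.exp(0.5 * log_sigma2)
        z = x_t / sigma
        loglik += ged_logpdf(z, nu) - math.log(sigma)   # log of (1/sigma_t) p_Z(x_t/sigma_t)
        g_hist.append(theta * z + gamma * (abs(z) - mean_abs))
    return loglik

x_demo = [0.1, -0.2, 0.3]
ll_gauss = fiegarch_loglik(x_demo, nu=2.0, d=0.25, theta=0.0, gamma=0.0, omega=0.0)
```

With $\theta = \gamma = 0$ (excluded in the model, but convenient as a numerical check), $\nu = 2$ and $\omega = 0$, the result reduces to the standard Gaussian log-likelihood of the data. The cost is quadratic in $n$ because every $\sigma_t$ depends on the whole past.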

Given any $I_0 \in \mathcal{F}_0$, select a prior conditional density function $p_{I_0}(\cdot \mid \eta)$ for $I_0$ given $\eta$. Also,
select a prior$^1$ density function $\pi_i(\cdot)$ for $\eta_i$ and a prior conditional probability density function
$p_{(-i)}(\cdot \mid \eta_i)$ for $\eta_{(-i)}$ given $\eta_i$, for each $i \in \{1, \dots, 5+p+q\}$.

Observe that, by applying Bayes' rule, the conditional probability density function of $\eta_i$
given $X$, $\eta_{(-i)}$ and any $I_0$ can be written as

$$p(\eta_i \mid X, \eta_{(-i)}, I_0) = \frac{p_X(X \mid \eta, I_0) \times p_{I_0}(I_0 \mid \eta) \times p_{(-i)}(\eta_{(-i)} \mid \eta_i) \times \pi_i(\eta_i)}{p_{(-i)}(X, \eta_{(-i)}, I_0)}, \tag{12}$$

for each $i \in \{1, \dots, 5+p+q\}$, where $p_X(\cdot \mid \eta, \mathcal{F}_0)$ is given in (11) and $p_{(-i)}(\cdot, \cdot, \cdot)$ is the joint
probability density function of $X$, $\eta_{(-i)}$ and $I_0$, which does not depend on $\eta_i$.

The parameter estimation is then carried out by using the MCMC method as described 
below. 

3.1 Gibbs Sampling with Metropolis Steps 

Gibbs sampling (Geman and Geman, 1984; Gelfand and Smith, 1990) is a popular MCMC
algorithm for obtaining a sequence of random samples from a multivariate probability
distribution when direct sampling is difficult. The algorithm assumes that the conditional
distribution of each random variable is known and easy to sample from. The steps of the
sampling procedure are the following.

Step 1. Set an arbitrary initial value for the vector of parameters $\eta$, namely,
$\eta^{(0)} = (\eta_1^{(0)}, \dots, \eta_{5+p+q}^{(0)})'$. Let $m = 0$;

Step 2. Given the sample $\eta^{(m)} = (\eta_1^{(m)}, \dots, \eta_{5+p+q}^{(m)})'$,

- generate $\eta_1^{(m+1)}$ from $p(\eta_1 \mid X, \eta_2^{(m)}, \eta_3^{(m)}, \dots, \eta_{5+p+q}^{(m)}, I_0)$;

- generate $\eta_2^{(m+1)}$ from $p(\eta_2 \mid X, \eta_1^{(m+1)}, \eta_3^{(m)}, \dots, \eta_{5+p+q}^{(m)}, I_0)$;

  $\vdots$

- generate $\eta_{5+p+q}^{(m+1)}$ from $p(\eta_{5+p+q} \mid X, \eta_1^{(m+1)}, \dots, \eta_{4+p+q}^{(m+1)}, I_0)$;

Step 3. Once the vector $\eta^{(m+1)} = (\eta_1^{(m+1)}, \dots, \eta_{5+p+q}^{(m+1)})'$ is obtained, return to Step 2, with
$m = m + 1$, until $m = N$, where $N$ is the desired sample size.
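
The structure of Steps 1 to 3 can be illustrated on a toy target where the full conditionals are known in closed form. The sketch below is not the FIEGARCH sampler: it assumes a standard bivariate normal target with correlation $\rho$, for which $x \mid y \sim N(\rho y, 1 - \rho^2)$ and $y \mid x \sim N(\rho x, 1 - \rho^2)$. All names are hypothetical.

```python
import random

def gibbs_bivariate_normal(rho, n_samples, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho (toy target)."""
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5
    x, y = 0.0, 0.0                          # Step 1: arbitrary initial value
    draws = []
    for _ in range(n_samples):               # Steps 2 and 3
        x = rng.gauss(rho * y, cond_sd)      # draw x given the current y
        y = rng.gauss(rho * x, cond_sd)      # draw y given the updated x
        draws.append((x, y))
    return draws

draws = gibbs_bivariate_normal(0.5, 20000, seed=1)
n = len(draws)
mean_x = sum(p[0] for p in draws) / n
mean_y = sum(p[1] for p in draws) / n
cov_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in draws) / n
var_x = sum((p[0] - mean_x) ** 2 for p in draws) / n
var_y = sum((p[1] - mean_y) ** 2 for p in draws) / n
corr_xy = cov_xy / (var_x * var_y) ** 0.5
```

Each component is updated using the most recent values of the others, exactly as in Step 2.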

When it is not possible to sample directly from $p(\eta_i \mid X, \eta_{(-i)}, I_0)$, for one or more
$i \in \{1, \dots, 5+p+q\}$, an alternative is to combine the Gibbs sampler with the
Metropolis-Hastings algorithm (Metropolis et al., 1953; Hastings, 1970). This method is usually
referred to as the Gibbs sampler with Metropolis steps. In this case, to draw the random variate
$\eta_i$, one follows the same Steps 1-3 just described. However, instead of sampling directly from
$p(\eta_i \mid X, \eta_{(-i)}, I_0)$, one uses the Metropolis-Hastings algorithm with $p(\eta_i \mid X, \eta_{(-i)}, I_0)$
as the invariant (target) distribution.

$^1$ In fact, the priors $\pi_i(\cdot)$ are not necessarily probability density functions. For instance,
$\pi(x) = 1$ and $\pi(x) = 1/x$ are examples of improper priors (i.e., they do not integrate to 1) used
in practice.

The Metropolis-Hastings algorithm is easy to implement since it does not require knowing the
normalization constant $p_{(-i)}(X, \eta_{(-i)}, I_0)$, defined in (12). For simplicity of notation, in what
follows $p^*(\cdot)$ denotes any one of the non-normalized probability density functions
corresponding to $p(\eta_i \mid X, \eta_{(-i)}, I_0)$, for $i \in \{1, \dots, 5+p+q\}$. The Metropolis-Hastings sampling
procedure consists of the following steps.

Step 1. Select a transition kernel$^2$ (also called proposal distribution) $q(\cdot \mid \cdot)$ for which the
sampling procedure is known.

Step 2. Set an arbitrary initial value $y_0$ for the chain. Let $m = 0$.

Step 3. Generate a draw $\xi$ from $q(\cdot \mid y_m)$.

Step 4. Calculate $\alpha(y_m, \xi) = \min\left\{ 1, \dfrac{p^*(\xi)\, q(y_m \mid \xi)}{p^*(y_m)\, q(\xi \mid y_m)} \right\}$.

Step 5. Draw $u \sim \mathcal{U}[0, 1]$.

Step 6. Define $y_{m+1} = \begin{cases} \xi, & \text{if } u < \alpha(y_m, \xi); \\ y_m, & \text{otherwise}. \end{cases}$

Step 7. If $m + 1 < N$ (where $N$ is the desired sample size), let $m = m + 1$ and go to Step 3.
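
Steps 1 to 7 can be sketched generically as below. The acceptance ratio is computed on the log scale to avoid numerical underflow, which is an implementation choice and not something stated in the text. The target used in the example (a standard normal known up to a constant) and the random-walk proposal are assumptions made only for illustration, and all names are hypothetical.

```python
import math
import random

def metropolis_hastings(log_target, propose, log_q, y0, n_samples, seed=0):
    """Generic Metropolis-Hastings chain.

    log_target(y): log p*(y), the target density up to an additive constant.
    propose(rng, y): draws xi from q(. | y).
    log_q(x, y): log q(x | y), up to a constant that does not depend on x or y.
    """
    rng = random.Random(seed)
    y = y0                                            # Step 2
    chain = [y]
    while len(chain) < n_samples:                     # Step 7
        xi = propose(rng, y)                          # Step 3
        log_ratio = (log_target(xi) + log_q(y, xi)
                     - log_target(y) - log_q(xi, y))  # Step 4 (log scale)
        alpha = 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)
        u = rng.random()                              # Step 5
        if u < alpha:                                 # Step 6
            y = xi
        chain.append(y)
    return chain

STEP = 2.4   # proposal standard deviation (illustrative value)
chain = metropolis_hastings(
    log_target=lambda y: -0.5 * y * y,
    propose=lambda rng, y: rng.gauss(y, STEP),
    log_q=lambda x, y: -0.5 * ((x - y) / STEP) ** 2,
    y0=0.0,
    n_samples=20000,
    seed=3,
)
chain_mean = sum(chain) / len(chain)
chain_var = sum((v - chain_mean) ** 2 for v in chain) / len(chain)
```

For a symmetric proposal the two `log_q` terms cancel; they are kept because the truncated normal kernel used in Section 4.2 is not symmetric.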

Remark 1. 

1. When considering the Gibbs sampler with Metropolis steps, only one iteration of the
Metropolis-Hastings algorithm is performed for each Gibbs sampler iteration.

2. In both cases, Gibbs sampler and Metropolis-Hastings algorithm, it is advisable to discard
the first $B$ observations, for some $B < N$ (that is, the burn-in sample), so that the retained
draws are taken after the chain has approached its stationary distribution.

3. The sample obtained from the algorithm described above is not independent. An alternative
is to run parallel chains instead. Another common strategy to reduce sample autocorrelations
is thinning the Markov chain, that is, keeping only every $k$-th simulated draw from each
sequence. There is some controversy surrounding the question of whether it is better
to run one long chain or several shorter ones (Gelman and Rubin, 1992; Geyer, 1992). Also,
MacEachern and Berliner (1994) show that one always gets more precise posterior estimates
if the entire Markov chain is used instead of the thinned one.



4 Simulation Study 

This simulation study considers FIEGARCH$(0, d, 0)$ processes. Under this scenario, the vector
of unknown parameters is $\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$. The Bayesian inference approach,
using MCMC to obtain posterior density functions, is used to estimate the parameters of the
model.

$^2$ A transition kernel is a function $q(x \mid y)$ which is a probability measure with respect to $x$, so $\int q(x \mid y)\,dx = 1$.

4.1 Data Generating Process

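The samples from FIEGARCH$(0, d, 0)$ processes are obtained by setting the following.

• $Z_0 \sim$ GED$(\nu)$, with zero mean and variance equal to one. Thus,

$$p_Z(z \mid \nu) = \frac{\nu \exp\big\{ -\frac{1}{2} |z \lambda^{-1}|^{\nu} \big\}}{\lambda\, 2^{1 + 1/\nu}\, \Gamma(1/\nu)}, \quad \text{with } \lambda = \left[ 2^{-2/\nu}\, \frac{\Gamma(1/\nu)}{\Gamma(3/\nu)} \right]^{1/2}, \quad \text{for all } z \in \mathbb{R},$$

where $\lambda$ is a scale constant (not the function $\lambda(\cdot)$ in (3));

• $d \in \{0.10, 0.25, 0.35, 0.45\}$ and $\nu \in \{1.1, 1.5, 1.9, 2.5, 5\}$;

• for all models, $\omega = -5.40$, $\theta = -0.15$ and $\gamma = 0.24$. These values are close to the ones
already observed in practical applications (see, for instance, Nelson, 1991; Bollerslev and
Mikkelsen, 1996; Ruiz and Veiga, 2008; Lopes and Prass, 2013);

• the infinite sum in (9) is truncated at $m^* = 50{,}000$.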
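
As a numerical sanity check of the GED$(\nu)$ density above, the sketch below integrates the density and its second moment with a simple trapezoidal rule for the values of $\nu$ used in the study. The integration range and grid size are arbitrary choices for this illustration, and the function names are hypothetical.

```python
import math

def ged_scale(nu):
    """Scale constant of the unit-variance GED(nu) density."""
    return math.sqrt(2.0 ** (-2.0 / nu) * math.gamma(1.0 / nu) / math.gamma(3.0 / nu))

def ged_pdf(z, nu):
    scale = ged_scale(nu)
    norm = nu / (scale * 2.0 ** (1.0 + 1.0 / nu) * math.gamma(1.0 / nu))
    return norm * math.exp(-0.5 * abs(z / scale) ** nu)

def trapezoid(f, lo, hi, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (hi - lo) / n
    inner = sum(f(lo + i * h) for i in range(1, n))
    return h * (0.5 * f(lo) + inner + 0.5 * f(hi))

checks = {}
for nu in (1.1, 1.5, 1.9, 2.5, 5.0):
    mass = trapezoid(lambda z: ged_pdf(z, nu), -30.0, 30.0, 20000)
    second_moment = trapezoid(lambda z: z * z * ged_pdf(z, nu), -30.0, 30.0, 20000)
    checks[nu] = (mass, second_moment)
```

For $\nu = 2$ the scale constant equals one and the density reduces to the standard normal density.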

For each combination of $d$ and $\nu$, a sample $\{z_t\}_{t=-m^*}^{n}$, of size $m^* + n + 1$, is drawn from
the GED$(\nu)$ distribution and then the sample $\{x_t\}_{t=1}^{n}$, from the FIEGARCH$(0, d, 0)$ process, is
obtained through the relation

$$\ln(\sigma_t^2) = \omega + \sum_{k=0}^{m^*} \lambda_{d,k}\, g(z_{t-1-k}) \quad \text{and} \quad x_t = \sigma_t z_t, \quad \text{for all } t = 1, \dots, n.$$
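
A minimal sketch of this data generating process is given below. It is not the authors' code: the GED draws are obtained through a gamma transformation (if $W \sim \text{Gamma}(1/\nu, 1)$, then $\lambda (2W)^{1/\nu}$ with a random sign has the GED$(\nu)$ density), the truncation lag defaults to a much smaller value than $m^* = 50{,}000$ to keep the example fast, and all names are hypothetical.

```python
import math
import random

def ged_scale(nu):
    return math.sqrt(2.0 ** (-2.0 / nu) * math.gamma(1.0 / nu) / math.gamma(3.0 / nu))

def ged_sample(rng, nu):
    """Draw from the unit-variance GED(nu) via a gamma transformation."""
    w = rng.gammavariate(1.0 / nu, 1.0)            # shape 1/nu, scale 1
    magnitude = ged_scale(nu) * (2.0 * w) ** (1.0 / nu)
    return magnitude if rng.random() < 0.5 else -magnitude

def simulate_fiegarch_0d0(n, nu, d, theta, gamma, omega, m_star=500, seed=0):
    """Simulate x_1..x_n from FIEGARCH(0,d,0), truncating the sum in (9) at m_star."""
    rng = random.Random(seed)
    coef = [1.0]                                   # lambda_{d,k} = pi_{d,k}
    for k in range(1, m_star + 1):
        coef.append(coef[-1] * (k - 1 + d) / k)
    mean_abs = ged_scale(nu) * 2.0 ** (1.0 / nu) * math.gamma(2.0 / nu) / math.gamma(1.0 / nu)
    # z[j] holds z_t for t = -m_star, ..., n, with j = t + m_star
    z = [ged_sample(rng, nu) for _ in range(m_star + n + 1)]
    gz = [theta * v + gamma * (abs(v) - mean_abs) for v in z]
    x = []
    for t in range(1, n + 1):
        log_sigma2 = omega + sum(coef[k] * gz[t - 1 - k + m_star] for k in range(m_star + 1))
        x.append(math.exp(0.5 * log_sigma2) * z[t + m_star])
    return x

x_sim = simulate_fiegarch_0d0(300, nu=1.9, d=0.25, theta=-0.15, gamma=0.24,
                              omega=-5.4, m_star=500, seed=42)
mean_sq = sum(v * v for v in x_sim) / len(x_sim)
```

Since $\omega = -5.4$, the simulated series has a small scale: the average of $x_t^2$ is of the order of $e^{\omega} \approx 0.0045$.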



4.2 Parameter Estimation Settings 

The samples from the posterior distributions are obtained by considering the Gibbs sampler
algorithm with Metropolis steps, as described in Section 3. The transition kernel $q(\cdot \mid \cdot)$
considered in the Metropolis-Hastings algorithm is the function defined as

$$q(x \mid y) = f(x; y, \sigma, a, b),$$

where $f(\cdot\,; \mu, \sigma, a, b)$ is the truncated normal density function, defined as

$$f(x; \mu, \sigma, a, b) = \begin{cases} \dfrac{\frac{1}{\sigma}\, \phi\big(\frac{x - \mu}{\sigma}\big)}{\Phi\big(\frac{b - \mu}{\sigma}\big) - \Phi\big(\frac{a - \mu}{\sigma}\big)}, & \text{if } a \le x \le b, \\[2ex] 0, & \text{otherwise}, \end{cases}$$

where $\phi(\cdot)$ and $\Phi(\cdot)$ are, respectively, the probability density and cumulative distribution
functions of the standard normal distribution; $a, b \in \mathbb{R}$ are, respectively, the lower and upper
limits of the distribution's support; $\mu$ and $\sigma$ denote, respectively, the mean and standard
deviation of the non-truncated version of the distribution.
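
The truncated normal kernel can be sketched with the standard library alone, as below. Sampling by inversion of the cumulative distribution function is an assumption of this example (the text does not say how the draws were generated), and the function names are hypothetical. The numerical values correspond to the kernel for $d$, with $\sigma = 0.025$ on $[0, 0.5]$.

```python
import random
from statistics import NormalDist

_STD = NormalDist()   # standard normal: provides pdf, cdf and inv_cdf

def truncnorm_pdf(x, mu, sigma, a, b):
    """Density f(x; mu, sigma, a, b) of a normal(mu, sigma) truncated to [a, b]."""
    if x < a or x > b:
        return 0.0
    mass = _STD.cdf((b - mu) / sigma) - _STD.cdf((a - mu) / sigma)
    return _STD.pdf((x - mu) / sigma) / (sigma * mass)

def truncnorm_sample(rng, mu, sigma, a, b):
    """Draw from the truncated normal by inverting the CDF."""
    lo = _STD.cdf((a - mu) / sigma)
    hi = _STD.cdf((b - mu) / sigma)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-16), 1.0 - 1e-16)       # keep inv_cdf away from 0 and 1
    x = mu + sigma * _STD.inv_cdf(u)
    return min(max(x, a), b)                  # guard against rounding outside [a, b]

rng = random.Random(7)
samples = [truncnorm_sample(rng, 0.49, 0.025, 0.0, 0.5) for _ in range(1000)]

# Factor q(y_m | xi) / q(xi | y_m) from Step 4 for y_m = 0.49 and xi = 0.45.
hastings_factor = (truncnorm_pdf(0.49, 0.45, 0.025, 0.0, 0.5)
                   / truncnorm_pdf(0.45, 0.49, 0.025, 0.0, 0.5))
```

Because the normalizing mass depends on the center $\mu$, this kernel is not symmetric near the boundaries, so the factor $q(y_m \mid \xi)/q(\xi \mid y_m)$ in the acceptance probability does not cancel in general.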

To select a reasonable $\eta^{(0)}$, $p_X(X \mid \eta, I_0)$ is calculated for different combinations of $\nu$, $d$, $\theta$, $\gamma$
and $\omega$. Then $\eta^{(0)}$ is defined as the vector $\eta = (\nu, d, \theta, \gamma, \omega)'$ with the highest likelihood function
value. To eliminate any dependence on the initial $\eta^{(0)}$, a burn-in of size 1000 is considered.

A sample obtained by the method being described will probably present significant
correlation$^3$. However, due to the ergodicity property of the Markov chain, the estimation of
the mean is not affected by the correlation in the sample. Therefore, to avoid unnecessary
computational work, which ultimately would not lead to improvement in terms of parameter
estimation, thinning is not implemented. Nevertheless, an example showing the influence of
using the entire chain, the thinned chain or only the first 1000 observations of the entire chain
(after burn-in) is provided in the following.

$^3$ In fact, for the parameter $d$, this correlation could only be removed when the thinning
parameter $t$ was set to 200.

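Example 1. Let $\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$ and assume that

$$\pi_i(\eta_i) = \begin{cases} c_i, & \text{if } \eta_i \in I_i; \\ 0, & \text{otherwise}; \end{cases} \quad \text{for each } i \in \{1, \dots, 5\}, \tag{13}$$

with $c_1 = 1$, $c_2 = c_3 = c_4 = 2$, $c_5 = 1/30$, $I_1 = (0, \infty)$, $I_2 = [0, 0.5]$, $I_3 = [-0.5, 0]$, $I_4 = I_2$ and
$I_5 = [-15, 15]$.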

In the sequel, $\{\eta_i^{(k)}\}_{k=1}^{n}$ denotes the chain of size $n$ obtained from the posterior distribution
of $\eta_i$, upon considering the prior $\pi_i(\eta_i)$ defined in (13), for each $i \in \{1, \dots, 5\}$. Also, $b$, $t$ and $N$
denote, respectively, the burn-in size, the thinning parameter and the sample size of the thinned
chain$^4$ obtained from $\{\eta_i^{(k)}\}_{k=1}^{n}$, for any $i \in \{1, \dots, 5\}$.

Figure 1 presents the graph of $\{\eta_i^{(k)}\}_{k=1}^{n}$, for each $i \in \{1, \dots, 5\}$, with $n = 200{,}801$. Figure
1 also shows the thinned chain of size $N = 1000$ obtained by considering $b = 1000$ and $t = 200$.
Furthermore, Figure 1 gives the sample of size 1000, obtained from $\{\eta_i^{(k)}\}_{k=1}^{n}$ by considering
a burn-in equal to 1000 and no thinning, for each $i \in \{1, \dots, 5\}$. The true parameter values
of the FIEGARCH$(0, d, 0)$ model corresponding to these graphs are $\nu_0 = 1.9$, $d_0 = 0.25$, $\theta_0 =
-0.15$, $\gamma_0 = 0.24$ and $\omega_0 = -5.4$. Figure 2 gives the histogram and kernel density functions
corresponding to each sample in Figure 1. The graphs of the prior $\pi_i(\eta_i)$ defined in (13), for
$i \in \{1, \dots, 5\}$, are represented in Figure 2 by the dashed lines. For a better visualization of
the posterior distributions, in Figure 2, the range for the x-axis was restricted to the intervals
$[-1.5, 2.5]$, $[-0.5, 0]$, $[0, 0.5]$ and $[-5.6, -5.1]$, respectively, for the parameters $\nu$, $\theta$, $\gamma$ and $\omega$.

Figure 1: Original chain with sample size 200801 (top row). Thinned chain with sample size
1000 and thinning parameter equal to 200 (middle row). Unthinned chain with sample size 1000
(bottom row). For the middle and bottom rows the burn-in size is equal to 1000. The true
parameter values of the FIEGARCH$(0, d, 0)$ model corresponding to these graphs are $\nu_0 = 1.9$,
$d_0 = 0.25$, $\theta_0 = -0.15$, $\gamma_0 = 0.24$ and $\omega_0 = -5.4$.

$^4$ Observe that, by setting $b = 1000$ and $t = 200$, a thinned chain of size $N = 1000$ can only be obtained
from $\{\eta_i^{(k)}\}_{k=1}^{n}$ when $n \ge b + 1 + t(N - 1) = 200{,}801$.
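
The bookkeeping behind the burn-in size $b$, the thinning parameter $t$ and the thinned sample size $N$ can be written down explicitly. The helper names below are hypothetical; the sketch keeps the draws with indices $b+1, b+1+t, \dots, b+1+t(N-1)$ of a chain indexed from 1.

```python
def required_chain_length(burn_in, thin, n_keep):
    """Smallest n such that a thinned chain of size n_keep exists: b + 1 + t (N - 1)."""
    return burn_in + 1 + thin * (n_keep - 1)

def thin_chain(chain, burn_in, thin, n_keep):
    """Discard the first burn_in draws, then keep every thin-th draw, n_keep in total."""
    if len(chain) < required_chain_length(burn_in, thin, n_keep):
        raise ValueError("chain is too short for the requested burn-in and thinning")
    # chain[0] is draw k = 1, so draw k = b + 1 + j t sits at index b + j t
    return [chain[burn_in + j * thin] for j in range(n_keep)]

n_min = required_chain_length(1000, 200, 1000)
toy_chain = list(range(1, n_min + 1))          # stand-in for draws k = 1, ..., n
kept = thin_chain(toy_chain, 1000, 200, 1000)
```

With $t = 1$ the same helper returns the unthinned sample of size $N$ taken right after the burn-in.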

Figure 2: Histogram and kernel density functions for the original chain with sample size
200801 (top row); the thinned chain with sample size 1000 and thinning parameter equal to
200 (middle row) and the unthinned chain with sample size 1000 (bottom row). For the middle
and bottom rows the burn-in size is equal to 1000. The dashed lines correspond to the graphs
of the priors $\pi_i(\eta_i)$ defined in (13), for $i \in \{1, \dots, 5\}$. The range for the x-axis was restricted
to the intervals $[-1.5, 2.5]$, $[-0.5, 0]$, $[0, 0.5]$ and $[-5.6, -5.1]$, respectively, for the parameters
$\nu$, $\theta$, $\gamma$ and $\omega$. The true parameter values of the FIEGARCH$(0, d, 0)$ model corresponding to
these graphs are $\nu_0 = 1.9$, $d_0 = 0.25$, $\theta_0 = -0.15$, $\gamma_0 = 0.24$ and $\omega_0 = -5.4$.

As shown in Figure 1 (see also Table 1), the mean of the posterior distribution does not 
change significantly when the entire sample or the thinned chain is considered instead of the 
unthinned one. On the other hand, Figure 2 reinforces the idea that the entire chain gives better 
estimates for the density function (notice that the curves in the graphs are smoother) . Although 
the thinned chain is not as efficient as the entire chain, it still provides better estimates for the 
density function than the unthinned one. 

Table 1 presents the summary statistics for the samples obtained from the posterior
distribution for each parameter of the FIEGARCH$(0, d, 0)$ model. This table considers the entire,
thinned and unthinned chains. The statistics reported in this table are the sample mean ($\bar{\eta}_i$),
the sample standard deviation ($sd_{\eta_i}$) and the 95% credibility interval $CI_{0.95}(\eta_i)$ for the
parameter $\eta_i$ in $\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$, for each $i \in \{1, \dots, 5\}$. The true parameter values
considered for this illustration are $\nu_0 \in \{1.1, 1.5, 1.9, 2.5, 5.0\}$, $d_0 = 0.25$, $\theta_0 = -0.15$, $\gamma_0 = 0.24$ and
$\omega_0 = -5.4$.

From Table 1 it is clear that, for any $\eta_i$, with $i \in \{1, \dots, 5\}$, the use of the entire or
the thinned chain (thinning parameter $t = 200$) does not yield significant improvement in terms
of parameter estimation. Not even the differences in the sample standard deviations or in
the credibility intervals, which are the statistics affected by the sample correlation, justify the
computational effort to obtain a sample of size 200,801. The same conclusions are obtained
when considering $d_0 \in \{0.10, 0.35, 0.45\}$. This concludes the example.



Different prior distributions are tested, as explained in the sequel. Since the conditional
probability density function of $I_0$ given $\eta$ is difficult to obtain, in all scenarios it is assumed
that $g(Z_s) = 0$, for all $s < 1$, and $p_{I_0}(\cdot \mid \eta) = 1$ is fixed. Moreover, since (9) is well defined
regardless of the relation among the parameters of the model, it is assumed that

$$p_{(-i)}(\eta_{(-i)} \mid \eta_i) = \prod_{j \neq i} \pi_j(\eta_j),$$

for any $i \in \{1, \dots, 5\}$.

Table 1: Summary for the entire, thinned (thinning parameter $t = 200$) and unthinned samples
from the posterior distributions, considering uniform priors for all parameters: mean $\bar{\eta}_i$,
standard deviation $sd_{\eta_i}$ and the 95% credibility interval $CI_{0.95}(\eta_i)$ for the parameter $\eta_i$ in
$\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$, for each $i \in \{1, \dots, 5\}$. The true parameter values considered
in this simulation are $\nu_0 \in \{1.1, 1.5, 1.9, 2.5, 5.0\}$, $d_0 = 0.25$, $\theta_0 = -0.15$, $\gamma_0 = 0.24$ and
$\omega_0 = -5.4$. For the thinned and unthinned samples, the burn-in size is $b = 1000$. Each cell
reports the mean (standard deviation) on the first line and the credibility interval on the second.

Chain      $\nu_0$  $\nu$                $d$                 $\theta$             $\gamma$            $\omega$

Entire     1.1      1.095 (0.047)     0.263 (0.112)     -0.087 (0.038)     0.233 (0.062)     -5.469 (0.078)
                    [1.006; 1.188]    [0.035; 0.464]    [-0.164; -0.016]   [0.114; 0.358]    [-5.612; -5.299]
           1.5      1.478 (0.067)     0.220 (0.081)     -0.184 (0.037)     0.240 (0.057)     -5.408 (0.058)
                    [1.351; 1.612]    [0.056; 0.375]    [-0.257; -0.111]   [0.130; 0.355]    [-5.520; -5.291]
           1.9      2.077 (0.107)     0.252 (0.075)     -0.209 (0.032)     0.220 (0.051)     -5.386 (0.058)
                    [1.874; 2.297]    [0.092; 0.392]    [-0.272; -0.148]   [0.122; 0.322]    [-5.487; -5.261]
           2.5      2.727 (0.153)     0.298 (0.056)     -0.203 (0.025)     0.205 (0.045)     -5.361 (0.053)
                    [2.441; 3.040]    [0.184; 0.405]    [-0.253; -0.153]   [0.118; 0.296]    [-5.463; -5.252]
           5.0      5.227 (0.366)     0.220 (0.051)     -0.173 (0.019)     0.294 (0.036)     -5.303 (0.039)
                    [4.548; 5.978]    [0.115; 0.317]    [-0.211; -0.135]   [0.224; 0.366]    [-5.384; -5.230]

Thinned    1.1      1.095 (0.047)     0.264 (0.110)     -0.087 (0.038)     0.233 (0.062)     -5.469 (0.080)
                    [1.006; 1.193]    [0.038; 0.462]    [-0.161; -0.015]   [0.116; 0.362]    [-5.609; -5.288]
           1.5      1.480 (0.067)     0.221 (0.077)     -0.184 (0.037)     0.240 (0.057)     -5.405 (0.060)
                    [1.353; 1.613]    [0.070; 0.367]    [-0.257; -0.104]   [0.129; 0.352]    [-5.524; -5.291]
           1.9      2.079 (0.103)     0.253 (0.075)     -0.210 (0.031)     0.218 (0.052)     -5.388 (0.056)
                    [1.888; 2.292]    [0.098; 0.387]    [-0.269; -0.152]   [0.120; 0.327]    [-5.486; -5.261]
           2.5      2.718 (0.154)     0.298 (0.058)     -0.202 (0.026)     0.205 (0.044)     -5.361 (0.053)
                    [2.427; 3.064]    [0.181; 0.410]    [-0.255; -0.153]   [0.121; 0.291]    [-5.466; -5.254]
           5.0      5.224 (0.363)     0.218 (0.052)     -0.173 (0.019)     0.295 (0.037)     -5.302 (0.039)
                    [4.534; 5.991]    [0.112; 0.316]    [-0.211; -0.137]   [0.224; 0.370]    [-5.391; -5.230]

Unthinned  1.1      1.108 (0.039)     0.265 (0.129)     -0.089 (0.038)     0.230 (0.060)     -5.476 (0.052)
                    [1.028; 1.198]    [0.016; 0.467]    [-0.176; -0.019]   [0.108; 0.345]    [-5.553; -5.366]
           1.5      1.474 (0.067)     0.250 (0.071)     -0.175 (0.035)     0.240 (0.053)     -5.426 (0.057)
                    [1.352; 1.644]    [0.123; 0.393]    [-0.248; -0.100]   [0.135; 0.353]    [-5.535; -5.344]
           1.9      2.072 (0.109)     0.245 (0.074)     -0.210 (0.032)     0.223 (0.059)     -5.400 (0.062)
                    [1.885; 2.303]    [0.068; 0.400]    [-0.273; -0.152]   [0.116; 0.349]    [-5.537; -5.244]
           2.5      2.720 (0.148)     0.308 (0.060)     -0.200 (0.026)     0.198 (0.042)     -5.367 (0.058)
                    [2.412; 3.016]    [0.192; 0.426]    [-0.257; -0.155]   [0.119; 0.281]    [-5.462; -5.274]
           5.0      5.311 (0.356)     0.226 (0.045)     -0.176 (0.019)     0.293 (0.036)     -5.291 (0.035)
                    [4.676; 6.070]    [0.137; 0.316]    [-0.215; -0.140]   [0.225; 0.368]    [-5.346; -5.246]




In a first step, the prior distributions for $\nu$, $d$, $\theta$, $\gamma$ and $\omega$ are selected by considering only
the basic set of information usually available in practice. The information on each parameter
and the corresponding prior selected are given in Table 2. This scenario shall be referred to as
Case 1. Table 3 presents the mean, standard deviation, lower and upper limits for the transition
kernel considered at iteration $m$ of the Gibbs sampler with Metropolis steps, when the prior for
$\eta_i$ in $\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$, for each $i \in \{1, \dots, 5\}$, is defined according to Case 1.

Taiane S. Prass, Silvia R.C. Lopes and Jorge A. Achcar

Table 2: Information available in practice for the parameter $\eta_i$ in $\eta = (\nu, d, \theta, \gamma, \omega)' :=
(\eta_1, \dots, \eta_5)'$ and the corresponding prior considered, for each $i \in \{1, \dots, 5\}$.

Information Available                                                          Prior

The generalized error distribution is well defined for any $\nu > 0$.          $\nu \sim \mathbb{1}_{(0,\infty)}(\nu)$ *

Long-memory in volatility is observed if and only if $d \in (0, 0.5)$. This    $d \sim \mathcal{U}(0, 0.5)$
characteristic can be detected, for example, through the periodogram
function of the time series $\{\ln(X_t^2)\}_{t=1}^{n}$ (see Lopes and Prass, 2013).

Empirical evidence suggests that $\theta \in [-1, 0]$. **                       $\theta \sim \mathcal{U}(-1, 0)$

Empirical evidence suggests that $\gamma \in [0, 1]$. **                        $\gamma \sim \mathcal{U}(0, 1)$

$\omega = E(\ln(\sigma_t^2)) = E(\ln(X_t^2)) - E(\ln(Z_t^2))$.                 $\omega \sim \mathcal{U}(-15, 15)$
The choice of the interval for $\omega$ will depend on the magnitude of the data.
The sample mean of $\{\ln(X_t^2)\}_{t=1}^{n}$ or $\ln(\hat{\sigma}_X^2)$, where $\hat{\sigma}_X^2$ is the sample
variance of $\{X_t\}_{t=1}^{n}$, can be used to obtain a rough approximation for $\omega$.

Notes: * Given $A \subseteq \mathbb{R}$, the symbol $\mathbb{1}_A(x)$ denotes the improper prior defined as 1, if $x \in A$, and 0, if $x \notin A$.

** See, for instance, Nelson (1991); Bollerslev and Mikkelsen (1996); Ruiz and Veiga (2008); Lopes and
Prass (2013). To the best of our knowledge, a FIEGARCH model for which $\theta$ or $\gamma$ are not in the intervals,
respectively, $[-1, 0]$ and $[0, 1]$ has never been reported in the literature.

Table 3: Parameters of the truncated normal distribution (transition kernel) considered, at
iteration $m$ of the Gibbs sampler, to obtain the sample from the posterior distribution of the
parameter $\eta_i$ in $\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$, for each $i \in \{1, \dots, 5\}$.

Parameter                        $\nu$            $d$            $\theta$         $\gamma$         $\omega$
Mean ($\mu$)                     $\nu^{(m-1)}$    $d^{(m-1)}$    $\theta^{(m-1)}$  $\gamma^{(m-1)}$  $\omega^{(m-1)}$
Standard Deviation ($\sigma$)    0.500            0.025          0.050            0.050            1.500
Lower Limit ($a$)                0.000            0.000          -1.000           0.000            -15.000
Upper Limit ($b$)                10.000           0.500          0.000            1.000            15.000

Note: $\eta_i^{(m-1)}$, for any $i \in \{1, \dots, 5\}$, denotes the parameter value obtained in the $(m-1)$th iteration. Different
combinations of standard deviation, lower and upper limits were tested for the parameter $\eta_i$ in $\eta =
(\nu, d, \theta, \gamma, \omega)'$, for each $i \in \{1, \dots, 5\}$. The values presented here correspond to the final choice.

In a second step, the knowledge of the true parameter values is gradually incorporated
to provide more informative priors for $d$, $\theta$ and/or $\gamma$. This analysis, combined with the first
scenario, provides information on the sensitivity of the estimates with respect to the prior
functions and hyperparameters. In all cases, the priors for $\nu$ and $\omega$ are the same as the
ones defined in Table 2. The scenarios considered in this second step are described in the
following and shall be referred to as Case 2 to Case 5.

Case 2: Gaussian Prior for $x = \varphi^{-1}(d)$ and Uniform Priors for $\theta$ and $\gamma$.

In this case $\theta$ and $\gamma$ remain with the same priors as in Case 1. For the parameter $d$ it is assumed
that $x \sim \mathcal{N}(\mu_\varphi, \sigma_\varphi^2)$ and $d = \varphi(x)$, where $\varphi : \mathbb{R} \to (0, 0.5)$ is given by

$$\varphi(x) = \frac{e^x}{2(1 + e^x)}, \quad \text{for all } x \in \mathbb{R}. \tag{14}$$

First, the knowledge of $d_0$ is applied to set $\mu_\varphi = \varphi^{-1}(d_0)$, so $\mu_\varphi \in \{-1.386, 0.000, 0.847, 2.197\}$,
respectively, for $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$. This scenario shall be referred to as C2.1. Second,
the knowledge of $d_0$ is ignored and the parameter $\mu_\varphi$ is assumed to be equal to zero. This
scenario shall be referred to as C2.2. For both C2.1 and C2.2, different values of $\sigma_\varphi$ are tested.
Third, the approaches considered in C2.1 and C2.2 are combined by setting $\mu_\varphi = \varphi^{-1}(\hat{d})$, where
$\hat{d}$ is the estimate of $d$ obtained in C2.2. This scenario shall be referred to as C2.3. The value of
$\sigma_\varphi$ considered in C2.3 is the one which provides better estimates for $d$ in C2.1.

The kernel parameter values for $\nu$, $\theta$, $\gamma$ and $\omega$ are the same as in Case 1. For $x = \varphi^{-1}(d)$,
at iteration $m$ of the Gibbs sampler, the kernel's mean ($\mu$), standard deviation ($\sigma$), lower ($a$)
and upper ($b$) limits are set, respectively, as $x^{(m-1)}$, 1, -10 and 10, where $x^{(m-1)}$ is the parameter
value obtained at iteration $m - 1$.
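
The transformation in (14) and its inverse, $\varphi^{-1}(d) = \ln\big(2d/(1 - 2d)\big)$, can be checked numerically against the values of $\mu_\varphi$ listed for scenario C2.1. The sketch below uses hypothetical function names.

```python
import math

def phi(x):
    """phi(x) = e^x / (2 (1 + e^x)), written in the equivalent form 0.5 / (1 + e^{-x})."""
    return 0.5 / (1.0 + math.exp(-x))

def phi_inverse(d):
    """Inverse of phi on (0, 0.5): x = ln(2d / (1 - 2d))."""
    return math.log(2.0 * d / (1.0 - 2.0 * d))

d_values = (0.10, 0.25, 0.35, 0.45)
mu_phi = [round(phi_inverse(d0), 3) for d0 in d_values]
round_trip = [phi(phi_inverse(d0)) for d0 in d_values]
```

Sampling $x$ on the real line and mapping back through $\varphi$ guarantees $d \in (0, 0.5)$ without any rejection step.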

Case 3: Gaussian Prior for $x = \phi^{-1}(d)$, Beta Prior for $-\theta$ and Uniform Prior for $\gamma$.

In this case, the priors of $\gamma$ and $d$ are the same ones considered, respectively, in Case 1 and in
scenario C2.1 of Case 2. It is also assumed that $-\theta \sim \mathrm{Beta}(a_1, b_1)$, which is equivalent to setting

$$\pi_3(\theta) = \frac{(-\theta)^{a_1 - 1}(1 + \theta)^{b_1 - 1}}{B(a_1, b_1)}, \quad \theta \in [-1, 0],$$

where $B(\cdot, \cdot)$ is the beta function.

First, the fact that $X \sim \mathrm{Beta}(a, b)$ implies $\mathbb{E}(X) = a(a + b)^{-1}$ is applied to set $b_1 =
a_1(1 + \theta_0)(-\theta_0)^{-1}$, where $\theta_0 = -0.15$ is the true parameter value considered in this simulation
study. Different values of $a_1$ are tested. This scenario shall be referred to as C3.1. Second, the
knowledge of $\theta_0$ is ignored and different combinations of $a_1$ and $b_1$ are tested. This scenario
shall be referred to as C3.2. Third, the approaches considered in C3.1 and C3.2 are combined
by setting $b_1 = a_1(1 + \hat{\theta})(-\hat{\theta})^{-1}$, where $\hat{\theta}$ is the estimate of $\theta$ obtained in C3.2. The value of
$a_1$ considered in this case is the one which provides better estimates for $\theta$ in C3.1. This scenario
shall be referred to as C3.3.
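The rule used in C3.1 amounts to choosing $b_1$ so that the prior mean of $-\theta$ equals $-\theta_0$. A minimal sketch of this mean-matching step is given below; the helper name `beta_b_from_mean` is illustrative and not taken from the original study.

```python
def beta_b_from_mean(a: float, mean: float) -> float:
    """Return b such that Beta(a, b) has the given mean, i.e. a / (a + b) = mean."""
    if not 0.0 < mean < 1.0:
        raise ValueError("the mean of a Beta distribution must lie in (0, 1)")
    return a * (1.0 - mean) / mean

theta0 = -0.15            # true value of theta used in the simulation study
a1 = 110.0                # one of the values of a1 tested in scenario C3.1

# The prior is placed on -theta, so the target mean is -theta0 = 0.15.
b1 = beta_b_from_mean(a1, -theta0)
print(round(b1, 3))       # 623.333
```

The same function applies, with the obvious changes, to the hyperparameters of the Beta priors for $\gamma$ and $2d$.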

The kernel parameter values are the same as in Case 2. 

Case 4: Gaussian Prior for $x = \phi^{-1}(d)$, Beta Priors for $-\theta$ and $\gamma$.

In this case, the priors of $d$ and $-\theta$ are the same ones considered, respectively, in scenario C2.1 of
Case 2 and in scenario C3.1 of Case 3. It is also assumed that $\gamma \sim \mathrm{Beta}(a_2, b_2)$. Two scenarios,
denoted by C4.1 and C4.2, are considered. With the obvious identifications, the construction
of C4.1 and C4.2 is analogous, respectively, to the construction of scenarios C3.1 and C3.2 in
Case 3.

The kernel parameter values are the same as in Case 2. 



Case 5: Beta Priors for $2d$, $-\theta$ and $\gamma$.

In this case, the priors of $-\theta$ and $\gamma$ are the same ones considered, respectively, in scenario
C3.1 of Case 3 and in scenario C4.1 of Case 4. Moreover, for each $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$
considered in this simulation study, it is assumed that $2d \sim \mathrm{Beta}(a_3, b_3)$, which is equivalent to
setting

$$\pi_2(d) = \frac{2\,(2d)^{a_3 - 1}(1 - 2d)^{b_3 - 1}}{B(a_3, b_3)}, \quad d \in [0, 0.5],$$

where $B(\cdot, \cdot)$ is the beta function.

In this case, only two scenarios are considered. First, it is assumed that $b_3 = a_3(1 -
2d_0)(2d_0)^{-1}$ and different values of $a_3$ are tested. This scenario shall be referred to as C5.1.
Second, an approach similar to scenarios C3.3 and C4.3, respectively, in Case 3 and Case 4, is
considered. However, in this case, it is assumed that $b_3 = a_3(1 - 2\hat{d})(2\hat{d})^{-1}$, with $\hat{d}$ obtained in
Case 1. The value of $a_3$ considered in this case is the one which provides better estimates for $d$
in C5.1. This scenario shall be referred to as C5.2.
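The density of $d$ implied by $2d \sim \mathrm{Beta}(a_3, b_3)$ can be written down and checked numerically. The sketch below assumes the density is $2(2d)^{a_3-1}(1-2d)^{b_3-1}/B(a_3, b_3)$ on $(0, 0.5)$, evaluates it on the log scale for numerical stability, and verifies with a midpoint rule that it integrates to about one. The function names are illustrative only.

```python
import math

def log_beta(a: float, b: float) -> float:
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def prior_density_d(d: float, a3: float, b3: float) -> float:
    """Density of d when 2d ~ Beta(a3, b3); zero outside the open interval (0, 0.5)."""
    if not 0.0 < d < 0.5:
        return 0.0
    log_pdf = (math.log(2.0)
               + (a3 - 1.0) * math.log(2.0 * d)
               + (b3 - 1.0) * math.log(1.0 - 2.0 * d)
               - log_beta(a3, b3))
    return math.exp(log_pdf)

a3 = 25.0
d0 = 0.25
b3 = a3 * (1.0 - 2.0 * d0) / (2.0 * d0)   # mean matching as in scenario C5.1

# Midpoint rule on (0, 0.5): total mass and prior mean of d.
n = 20000
h = 0.5 / n
grid = [(k + 0.5) * h for k in range(n)]
total_mass = sum(prior_density_d(d, a3, b3) for d in grid) * h
prior_mean = sum(d * prior_density_d(d, a3, b3) for d in grid) * h
print(b3, round(total_mass, 6), round(prior_mean, 6))
```

With this choice of $b_3$ the prior mean of $d$ equals $d_0$, which is the purpose of the mean-matching rule.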

The kernel parameter values are the same as in Case 1. 

4.3 Estimates and Performance Measures 

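Let $\{\eta_i^{(k)}\}_{k=1}^{M}$ be a sample of size $M$ from the posterior distribution of $\eta_i$ in $\eta = (\nu, d, \theta, \gamma, \omega)' :=
(\eta_1, \dots, \eta_5)'$, for any $i \in \{1, \dots, 5\}$. Denote by $\bar{\eta}_i$ and $\mathrm{sd}_{\eta_i}$, respectively, the sample mean and
standard deviation of $\{\eta_i^{(k)}\}_{k=1}^{M}$, namely,

$$\bar{\eta}_i := \frac{1}{M}\sum_{k=1}^{M} \eta_i^{(k)} \quad \text{and} \quad \mathrm{sd}_{\eta_i} := \sqrt{\frac{1}{M}\sum_{k=1}^{M}\big(\eta_i^{(k)} - \bar{\eta}_i\big)^2}, \quad \text{for any } i \in \{1, \dots, 5\}.$$

Then the estimate $\hat{\eta}_i$ of $\eta_i$ is defined as $\hat{\eta}_i := \bar{\eta}_i$.

Moreover, let $q_i(\alpha)$ denote the $\alpha$ quantile$^5$ for the posterior sample distribution of $\eta_i$, for
any $\alpha \in [0, 1]$ and $i \in \{1, \dots, 5\}$. Then a $100(1 - \alpha)\%$ credibility interval for $\eta_i$ is given by

$$CI_{1-\alpha}(\eta_i) = \Big[\, q_i\Big(\frac{\alpha}{2}\Big);\ q_i\Big(1 - \frac{\alpha}{2}\Big) \Big], \quad \text{for any } i \in \{1, \dots, 5\}.$$

Furthermore, the estimation bias and the absolute percentage error (ape) of estimation are
given, respectively, by

$$\mathrm{bias}_{\eta_i} := \hat{\eta}_i - \eta_i \quad \text{and} \quad \mathrm{ape}_{\eta_i} := \left|\frac{\mathrm{bias}_{\eta_i}}{\eta_i}\right|, \quad \text{for any } i \in \{1, \dots, 5\}.$$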
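The performance measures of this section can be computed from a stored chain in a few lines. The following is a minimal sketch, assuming the standard deviation is normalized by $M$ and the quantile is taken as the smallest sample value whose empirical distribution function is at least $\alpha$ (one value that satisfies the definition adopted in the footnote). The function names and the artificial draws are hypothetical and are not output of the study.

```python
import math
import random

def empirical_quantile(sorted_sample, alpha):
    """Smallest sample value q such that the fraction of values <= q is at least alpha."""
    m = len(sorted_sample)
    # The small tolerance guards against floating-point noise in alpha * m.
    index = max(math.ceil(alpha * m - 1e-12) - 1, 0)
    return sorted_sample[index]

def summarize(sample, true_value, alpha=0.05):
    """Posterior mean, sd, 100(1 - alpha)% credibility interval, bias and ape."""
    m = len(sample)
    mean = sum(sample) / m
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / m)
    ordered = sorted(sample)
    ci = (empirical_quantile(ordered, alpha / 2.0),
          empirical_quantile(ordered, 1.0 - alpha / 2.0))
    bias = mean - true_value
    ape = abs(bias / true_value)
    return {"mean": mean, "sd": sd, "ci": ci, "bias": bias, "ape": ape,
            "covers_true_value": ci[0] <= true_value <= ci[1]}

# Artificial stand-in for M = 1000 posterior draws of d (not data from the paper).
rng = random.Random(1)
draws = [rng.gauss(0.25, 0.03) for _ in range(1000)]
print(summarize(draws, true_value=0.25))
```

The entry `covers_true_value` corresponds to the check behind the bold-face credibility intervals in the tables, and `ape > 0.10` to the check behind the bold-face means.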



4.4 Results 

The results obtained in this simulation study, by considering the scenarios described in Section 
4.2, are the following. 

Case 1: The Priors as Defined in Table 2. 

Table 4 presents the summary statistics for the samples obtained from the posterior distribution
of each parameter of the FIEGARCH(0, d, 0) model. The statistics reported in this table (the
same applies to Table 5) are the sample mean ($\bar{\eta}_i$), the sample standard deviation ($\mathrm{sd}_{\eta_i}$) and
the 95% credibility interval $CI_{0.95}(\eta_i)$ for the parameter $\eta_i$ in $\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$,
for each $i \in \{1, \dots, 5\}$. The bold-face font for the mean indicates that the absolute percentage
error of estimation ($\mathrm{ape}_{\eta_i}$) in the corresponding case is higher than 0.10 (that is, 10%). The
bold-face font for the credibility interval indicates that the true parameter value is not contained
in the interval.



5 In this work, the following definition is adopted (Brockwell and Davis, 1991). Given any $0 < \alpha < 1$, the
number $q(\alpha)$ satisfying $P(X \le q(\alpha)) \ge \alpha$ and $P(X \ge q(\alpha)) \ge 1 - \alpha$ is called a quantile of order $\alpha$ (or $\alpha$ quantile)
for the random variable $X$ (or for the distribution function of $X$).
Table 4: Summary for the sample obtained from posterior distributions considering all priors
uniform: mean $\bar{\eta}_i$, standard deviation $\mathrm{sd}_{\eta_i}$ and the 95% credibility interval $CI_{0.95}(\eta_i)$ for the
parameter $\eta_i$ in $\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$, for each $i \in \{1, \dots, 5\}$. The true parameter
values considered in this simulation are $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$, $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$,
$\theta_0 = -0.15$, $\gamma_0 = 0.24$ and $\omega_0 = -5.4$.

d0    ν0    ν (sd_ν)        d (sd_d)        θ (sd_θ)          γ (sd_γ)        ω (sd_ω)
            CI_0.95(ν)      CI_0.95(d)      CI_0.95(θ)        CI_0.95(γ)      CI_0.95(ω)

0.10  1.1   1.093 (0.044)   0.181 (0.123)   -0.084 (0.041)    0.236 (0.066)   -5.438 (0.058)
            [0.989; 1.195]  [0.005; 0.458]  [-0.171; -0.013]  [0.093; 0.357]  [-5.551; -5.337]
      1.5   1.480 (0.069)   0.147 (0.079)   -0.177 (0.038)    0.232 (0.052)   -5.420 (0.036)
            [1.353; 1.635]  [0.020; 0.330]  [-0.258; -0.106]  [0.122; 0.340]  [-5.510; -5.338]
      1.9   2.088 (0.111)   0.093 (0.055)   -0.220 (0.032)    0.216 (0.060)   -5.410 (0.035)
            [1.908; 2.296]  [0.004; 0.201]  [-0.286; -0.154]  [0.105; 0.337]  [-5.486; -5.340]
      2.5   2.724 (0.140)   0.192 (0.076)   -0.201 (0.025)    0.198 (0.045)   -5.388 (0.031)
            [2.491; 3.027]  [0.038; 0.330]  [-0.256; -0.153]  [0.116; 0.287]  [-5.448; -5.333]
      5.0   5.297 (0.364)   0.101 (0.051)   -0.174 (0.020)    0.297 (0.036)   -5.336 (0.028)
            [4.641; 6.014]  [0.015; 0.217]  [-0.215; -0.133]  [0.232; 0.374]  [-5.383; -5.287]

0.25  1.1   1.108 (0.039)   0.265 (0.129)   -0.089 (0.038)    0.230 (0.060)   -5.476 (0.052)
            [1.028; 1.198]  [0.016; 0.467]  [-0.176; -0.019]  [0.108; 0.345]  [-5.553; -5.366]
      1.5   1.474 (0.067)   0.250 (0.071)   -0.175 (0.035)    0.240 (0.053)   -5.426 (0.057)
            [1.352; 1.644]  [0.123; 0.393]  [-0.248; -0.100]  [0.135; 0.353]  [-5.535; -5.344]
      1.9   2.072 (0.109)   0.245 (0.074)   -0.210 (0.032)    0.223 (0.059)   -5.400 (0.062)
            [1.885; 2.303]  [0.068; 0.400]  [-0.273; -0.152]  [0.116; 0.349]  [-5.537; -5.244]
      2.5   2.720 (0.148)   0.308 (0.060)   -0.200 (0.026)    0.198 (0.042)   -5.367 (0.058)
            [2.412; 3.016]  [0.192; 0.426]  [-0.257; -0.155]  [0.119; 0.281]  [-5.462; -5.274]
      5.0   5.311 (0.356)   0.226 (0.045)   -0.176 (0.019)    0.293 (0.036)   -5.291 (0.035)
            [4.676; 6.070]  [0.137; 0.316]  [-0.215; -0.140]  [0.225; 0.368]  [-5.346; -5.246]

0.35  1.1   1.099 (0.040)   0.349 (0.108)   -0.097 (0.038)    0.230 (0.056)   -5.495 (0.090)
            [1.009; 1.194]  [0.093; 0.492]  [-0.178; -0.027]  [0.121; 0.330]  [-5.674; -5.318]
      1.5   1.479 (0.065)   0.329 (0.065)   -0.178 (0.036)    0.246 (0.052)   -5.423 (0.076)
            [1.352; 1.639]  [0.204; 0.461]  [-0.246; -0.106]  [0.143; 0.340]  [-5.561; -5.302]
      1.9   2.064 (0.110)   0.364 (0.054)   -0.199 (0.030)    0.233 (0.050)   -5.377 (0.090)
            [1.843; 2.299]  [0.227; 0.461]  [-0.265; -0.139]  [0.139; 0.330]  [-5.535; -5.199]
      2.5   2.732 (0.150)   0.380 (0.052)   -0.201 (0.024)    0.200 (0.043)   -5.307 (0.066)
            [2.481; 3.031]  [0.283; 0.479]  [-0.254; -0.161]  [0.110; 0.285]  [-5.410; -5.149]
      5.0   5.229 (0.321)   0.318 (0.040)   -0.177 (0.019)    0.289 (0.036)   -5.227 (0.046)
            [4.603; 5.864]  [0.243; 0.409]  [-0.216; -0.140]  [0.227; 0.366]  [-5.298; -5.127]

0.45  1.1   1.096 (0.039)   0.436 (0.053)   -0.115 (0.034)    0.241 (0.054)   -5.453 (0.128)
            [1.010; 1.161]  [0.313; 0.499]  [-0.187; -0.047]  [0.136; 0.338]  [-5.716; -5.174]
      1.5   1.475 (0.073)   0.411 (0.048)   -0.179 (0.034)    0.257 (0.048)   -5.411 (0.123)
            [1.353; 1.627]  [0.311; 0.494]  [-0.246; -0.110]  [0.158; 0.353]  [-5.600; -5.133]
      1.9   2.052 (0.111)   0.450 (0.032)   -0.191 (0.026)    0.243 (0.043)   -5.367 (0.130)
            [1.846; 2.279]  [0.385; 0.497]  [-0.238; -0.141]  [0.165; 0.320]  [-5.614; -5.125]
      2.5   2.725 (0.152)   0.447 (0.032)   -0.206 (0.021)    0.211 (0.041)   -5.150 (0.081)
            [2.461; 3.021]  [0.381; 0.495]  [-0.247; -0.167]  [0.133; 0.296]  [-5.310; -4.985]
      5.0   5.177 (0.322)   0.417 (0.032)   -0.177 (0.019)    0.286 (0.032)   -5.041 (0.068)
            [4.553; 5.832]  [0.348; 0.480]  [-0.220; -0.140]  [0.228; 0.350]  [-5.182; -4.929]


Note: The bold-face font for the estimated mean indicates that the absolute percentage error is higher than 
10%. The bold-face font for the credibility interval indicates that the interval does not contain the true 
parameter value. 
From Table 4 one observes that the parameters $\nu$ and $\omega$ are always well estimated, in terms
of absolute percentage error (ape), regardless of the combination of $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$
and $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$ considered (the error is less than 10% in all cases). The credibility
interval $CI_{0.95}(\nu)$ contains the true parameter value $\nu_0$ in all cases, except when $d_0 = 0.10$
and $\nu_0 = 1.9$. Also, the estimation bias for $\nu$ is always negative when $\nu_0 < 1.9$ and positive
when $\nu_0 \ge 1.9$, except for the combination $(\nu_0, d_0) = (1.1, 0.25)$. For the parameter $\omega$, the
credibility interval $CI_{0.95}(\omega)$ does not contain the true parameter value ($\omega_0 = -5.4$) in 5 out of
20 combinations of $\nu_0$ and $d_0$ (see $\nu_0 = 5$ and all $d_0$; $\nu_0 = 2.5$ and $d_0 = 0.45$). Moreover, the
estimation bias for $\omega$ is always negative when $\nu_0 \le 1.5$ and always positive when $\nu_0 \ge 2.5$.

Table 4 also reports that $\mathrm{ape}_\theta > 10\%$ for all combinations of $d_0$ and $\nu_0$. On the other hand,
in most cases (14 out of 20), the credibility interval $CI_{0.95}(\theta)$ contains the true parameter value
$\theta_0 = -0.15$. The cases for which $\theta_0 \notin CI_{0.95}(\theta)$ are $\nu_0 = 1.9$ with $d_0 \in \{0.10, 0.25\}$, and $\nu_0 = 2.5$
with any $d_0$. The bias for $\theta$ is always positive when $\nu_0 = 1.1$ (for any $d_0$) and negative in all
other cases.

Furthermore, Table 4 shows that the parameter $\gamma$ seems to be better estimated when the
GED distribution presents heavy tails ($\nu_0 < 2$), except when $d_0 = 0.10$, in which case $\mathrm{ape}_\gamma > 10\%$
when $\nu_0 = 1.9$. Also, with the exception of four cases ($d_0 = 0.10$ and $\nu_0 \in \{1.1, 1.5, 2.5\}$; $d_0 = 0.25$
and $\nu_0 = 2.5$), the parameter $d$ is always well estimated. The bias for the parameters $d$ and $\gamma$ does
not seem to follow any pattern, and both $d_0 \in CI_{0.95}(d)$ and $\gamma_0 \in CI_{0.95}(\gamma)$ for any combination
of $\nu_0$ and $d_0$.



Case 2: Gaussian Prior for $x = \phi^{-1}(d)$ and Uniform Priors for $\theta$ and $\gamma$.

Changing the prior for $d$ does not yield a significant difference in the estimation of $\nu$, $\theta$, $\gamma$ and $\omega$.

When the true value $d_0$ is used to set $\mu_\phi = \phi^{-1}(d_0)$ (scenario C2.1), the best performance
is observed by letting $\sigma_\phi = 0.15$. In this case, the absolute percentage error of estimation ($\mathrm{ape}_d$)
is smaller than 10% for all combinations of $\nu_0$ and $d_0$, with $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ and
$\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$. If $\sigma_\phi = 0.10$ the chain takes too long to move from the initial point
when $d_0 = 0.10$. When $\sigma_\phi = 0.25$, there is only one case for which $\mathrm{ape}_d > 10\%$ ($d_0 = 0.10$ and
$\nu_0 = 2.5$). In fact, in this case, $\mathrm{ape}_d = 0.103$, which is still acceptable ($\sigma_\phi = 0.15$ still seems to
be the best choice). Furthermore, as $\sigma_\phi$ increases, the number of cases for which $\mathrm{ape}_d > 10\%$
also increases. For instance, when $\sigma_\phi \in \{0.50, 1.00, 3.00\}$, $\mathrm{ape}_d > 10\%$ in 2, 4 and 10 cases,
respectively.

When $d_0$ is assumed unknown and $\mu_\phi$ is set to zero (scenario C2.2), $\sigma_\phi = 3$ seems to provide
better results than smaller values of $\sigma_\phi$. Under this scenario, $\mathrm{ape}_d > 10\%$ for 8 out of 20
combinations of $\nu_0$ and $d_0$. Therefore, $d \sim \mathcal{U}(0, 0.5)$ still provides better estimates for the
parameter $d$ (see Table 4). Higher values of $\sigma_\phi$ do not improve the estimation of $d$, and values
of $\sigma_\phi$ that are too high actually make the estimation worse. In particular, when $\sigma_\phi = 4$ the results are
similar to $\sigma_\phi = 3$, if $d_0 > 0.1$. If $d_0 = 0.1$ then $\sigma_\phi = 3$ is slightly better than $\sigma_\phi = 4$. When
$\sigma_\phi = 5$, $\mathrm{ape}_d$ is, in most cases, higher than when $\sigma_\phi = 3$. For $\sigma_\phi$ smaller than 3 the estimation
bias is much higher. For instance, when $\sigma_\phi = 0.15$, $\mathrm{ape}_d < 10\%$ only for $d_0 = 0.25$ (for all $\nu_0$).
For all other combinations of $d_0$ and $\nu_0$, $\mathrm{ape}_d > 20\%$. Also, when $\sigma_\phi = 1$, $\mathrm{ape}_d > 20\%$ in 12 out
of 20 cases. In particular, $\mathrm{ape}_d > 20\%$ for $d_0 = 0.10$ and all $\nu_0$. As should be expected, C2.1
performs much better than C2.2.

Upon considering a two-step estimator (scenario C2.3), no improvement is observed when
compared to scenario C2.2. In fact, the estimates obtained by letting $\mu_\phi = \phi^{-1}(\hat{d})$ (where $\hat{d}$
is the estimate of $d$ obtained in C2.2) and $\sigma_\phi = 0.15$ (the value which leads to the best
performance in C2.1) are very close to $\hat{d}$ itself.
Case 3: Gaussian Prior for $x = \phi^{-1}(d)$, Beta Prior for $-\theta$ and Uniform Prior for $\gamma$.

The estimation of $\nu$, $d$, $\gamma$ and $\omega$ is not significantly affected by the change in the prior for $-\theta$.

When the knowledge of the true parameter values $d_0$ and $\theta_0$ is applied to set $\mu_\phi = \phi^{-1}(d_0)$,
$\sigma_\phi = 0.15$, for each $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ (the best scenario in Case 2), and $b_1 = a_1(1 +
\theta_0)(-\theta_0)^{-1}$ (scenario C3.1), it is observed that larger values of $a_1$ lead to better estimates
for $\theta$. Although any $a_1 \in \{110, 150, 200\}$ leads to $\mathrm{ape}_\theta < 10\%$, for all combinations of $d_0 \in
\{0.10, 0.25, 0.35, 0.45\}$ and $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$, the best performance is obtained by setting
$a_1 = 110$ ($b_1 \approx 623.333$). As $a_1$ decreases, the estimation performance decreases. For instance,
when $a_1 = 100$, one case for which $\mathrm{ape}_\theta > 10\%$ is observed. When $a_1 = 20$ the number of
cases for which $\mathrm{ape}_\theta > 10\%$ increases to 10, and no case for which $\mathrm{ape}_\theta < 10\%$ is observed if
$a_1 \in \{2.0, 0.1\}$. More specifically, for any $a_1 \in \{2.0, 0.1\}$, $10\% < \mathrm{ape}_\theta < 20\%$ for $\nu_0 \in \{1.5, 5.0\}$
and all values of $d_0$ (8 out of 20 cases) and, in the remaining 12 cases, $\mathrm{ape}_\theta > 20\%$.

By assuming $\theta_0$ unknown (scenario C3.2) or by considering a two-step estimator (scenario
C3.3), no case for which $\mathrm{ape}_\theta < 10\%$ is observed. The combinations of $a_1$ and $b_1$ tested
in scenario C3.2 are: $(a_1, b_1) \in \{(2,3), (2,5), (2,9), (4,7), (5,7), (10,40), (10,60), (10,70),
(100,500), (100,600)\}$. Among these values, the best performance is obtained when $a_1 = 10$
and $b_1 = 50$. In this case, $10\% < \mathrm{ape}_\theta < 20\%$ in 12 out of 20 cases, which is slightly better than
the performance obtained assuming $\theta \sim \mathcal{U}(0, 1)$ (in this case, $10\% < \mathrm{ape}_\theta < 20\%$ in 8 out of 20
cases).

Case 4: Gaussian Prior for $x = \phi^{-1}(d)$, Beta Priors for $-\theta$ and $\gamma$.

Analogously to Case 2 and Case 3, the estimation of $\nu$, $d$, $\theta$ and $\omega$ is not significantly affected
by the change in the prior for $\gamma$.

By considering the true parameter values $d_0$, $\theta_0$ and $\gamma_0$ and setting $\mu_\phi = \phi^{-1}(d_0)$, $\sigma_\phi = 0.15$,
for each $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ (the best scenario in Case 2), $a_1 = 110$, $b_1 = a_1(1 +
\theta_0)(-\theta_0)^{-1}$ (the best scenario in Case 3) and $b_2 = a_2(1 - \gamma_0)\gamma_0^{-1}$ (scenario C4.1), the
following is observed: larger values of $a_2$ (smaller than $a_1$, however) lead to better estimates for $\gamma$ and,
as $a_2$ decreases, the estimation performance decays. For instance, when $a_2 = 40$ only one case
for which $\mathrm{ape}_\gamma > 10\%$ is observed and when $a_2 \in \{10, 25, 30\}$, the number of cases increases
to 5 ($d_0 = 0.45$ and all $\nu_0$). On the other hand, any $a_2 \in \{50, 100\}$ gives $\mathrm{ape}_\gamma < 10\%$, for all
combinations of $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ and $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$. The simulation results
for $a_2 = 50$ ($b_2 \approx 158.333$) are illustrated in Figure 3.

Figure 3 shows the sample mean (solid circle) and the 95% credibility interval (solid line)
for the sample obtained from the posterior distribution of $\nu$, $d$, $\theta$, $\gamma$ and $\omega$ (respectively, from
top to bottom), for each combination of $d_0$ and $\nu_0$. The true parameter values $\nu_0$, $d_0$, $\theta_0$, $\gamma_0$ and
$\omega_0$ are represented in the corresponding row by the dashed line. The graphs related to $\theta$, $\gamma$ and
$\omega$ (respectively, the third, fourth and fifth rows, from top to bottom) use the same scale
for all $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$. Also, for the parameters $\theta$, $\gamma$ and $\omega$, there is one graph for
each $d_0$ and, in each of these graphs, the true value of $\nu_0$ is indicated on the x-axis.

From Figure 3 one observes that, for $\nu$ and $\omega$, the conclusions regarding the estimation bias
and the credibility intervals are basically the same as in Case 1 (see Table 4). On the other
hand, under C4.1 of Case 4, $\mathrm{ape}_{\eta_i} < 10\%$ for all $i \in \{1, \dots, 5\}$ and any combination of
$d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ and $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$ (compare the parameter $\theta$ in Table 4
and Figure 3). As in Case 1, the bias for $\theta$ is always positive when $\nu_0 = 1.1$ (for any $d_0$) and
negative when $\nu_0 > 1.1$, and the bias for the parameters $d$ and $\gamma$ does not seem to follow any
pattern. Under the current scenario, $d_0$, $\theta_0$ and $\gamma_0$ are all contained in the respective credibility
intervals, for any combination of $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ and $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$.

Figure 3: Posterior mean (solid circle), true parameter value (dashed line) and 95%
credibility interval (solid line) for the parameters $\nu$, $d$, $\theta$, $\gamma$ and $\omega$ (from top to bottom), for
each combination of $d_0$ and $\nu_0$. The posterior distributions were obtained by considering an
improper prior for $\nu$, a Gaussian prior for $\phi^{-1}(d)$, Beta priors for $-\theta$ and $\gamma$ and a uniform prior
for $\omega$. The true parameter values considered in this simulation are $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$,
$\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$, $\theta_0 = -0.15$, $\gamma_0 = 0.24$ and $\omega_0 = -5.4$.

When the true value $\gamma_0$ is not used to choose $b_2$ (scenario C4.2), results similar to the
ones in Figure 3 are still obtained for some combinations of $(a_2, b_2)$. Not surprisingly, the
pairs $(a_2, b_2)$ which lead to good estimates are such that $a_2(a_2 + b_2)^{-1}$ (the mean $\mu_B$ of the
prior distribution) is close to $\gamma_0$. For instance, when $(a_2, b_2) \in \{(100, 300), (100, 350)\}$ ($\mu_B$
is, respectively, equal to 0.25 and 0.22, while $\gamma_0 = 0.24$), $\mathrm{ape}_\gamma < 10\%$ is obtained for all
combinations of $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ and $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$. The pair $(a_2, b_2) =
(100, 350)$ provides slightly better results than $(a_2, b_2) = (100, 300)$ only when $\nu_0 = 5$. When
$(a_2, b_2) = (100, 280)$ (so $\mu_B \approx 0.26$), $\mathrm{ape}_\gamma < 10\%$ in 16 out of 20 cases (in the remaining 4 cases,
$\mathrm{ape}_\gamma$ does not exceed 13.4%).

On the other hand, choosing $a_2$ and $b_2$ such that $a_2(a_2 + b_2)^{-1}$ is close to the true $\gamma_0$ does
not necessarily lead to good estimates. For instance, if $(a_2, b_2) = (5, 15)$ then $\mu_B = 0.25$, as
it is when $(a_2, b_2) = (100, 300)$, but $\mathrm{ape}_\gamma > 10\%$ in 6 out of 20 cases. Also, it is not evident
that the more distant $a_2(a_2 + b_2)^{-1}$ is from $\gamma_0$, the worse the estimation is. For instance,
by letting $(a_2, b_2) \in \{(3, 15), (100, 440), (10, 30), (10, 40), (20, 80), (100, 400), (100, 270), (5, 10)\}$
then, respectively, $\mu_B \in \{0.167, 0.185, 0.200, 0.200, 0.200, 0.200, 0.270, 0.330\}$ and it is observed
that $\mathrm{ape}_\gamma > 10\%$ in 14, 20, 4, 10, 12, 16, 13 and 5 out of 20 cases, respectively.
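Two Beta priors with the same mean can differ considerably in how concentrated they are, which may help to interpret why $(a_2, b_2) = (5, 15)$ and $(100, 300)$ behave differently. The sketch below computes the prior mean and standard deviation with the standard Beta moment formulas. This is offered only as one possible reading; the study itself does not attribute the difference to the prior variance, and the function name is illustrative.

```python
import math

def beta_mean_sd(a: float, b: float):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, math.sqrt(variance)

# Two of the pairs discussed in scenario C4.2: same prior mean, different spread.
for a2, b2 in [(5.0, 15.0), (100.0, 300.0)]:
    mean, sd = beta_mean_sd(a2, b2)
    print(f"(a2, b2) = ({a2:g}, {b2:g}): mean = {mean:.3f}, sd = {sd:.3f}")
```

Both priors have mean 0.25, but the first has a standard deviation more than four times larger than the second, so it constrains $\gamma$ far less.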



Case 5: Beta Priors for $2d$, $-\theta$ and $\gamma$.

Analogously to all other cases, the estimation of $\nu$, $\theta$, $\gamma$ and $\omega$ is not significantly affected by
the change in the prior for $d$.

Upon assuming $-\theta \sim \mathrm{Beta}(a_1, b_1)$ and $\gamma \sim \mathrm{Beta}(a_2, b_2)$, with the same $a_1$, $a_2$, $b_1$ and $b_2$
as in scenario C4.1 of Case 4, and letting $2d \sim \mathrm{Beta}(a_3, b_3)$, with $b_3 = a_3(1 - 2d_0)(2d_0)^{-1}$,
for each $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ (scenario C5.1 of Case 5), the following is concluded. If
$a_3 \in \{25, 50\}$ then $\mathrm{ape}_d < 10\%$ for all $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$. If $a_3$ is increased or decreased
too much, the estimation performance decays. For instance, $a_3 \in \{0.10, 0.20, 2.00\}$
yields $\mathrm{ape}_d > 10\%$ in 3, 1 and 7 cases, respectively.

Table 5 reports the simulation results for $a_3 = 25$ and $b_3 = a_3(1 - 2d_0)(2d_0)^{-1}$, which
gives $b_3 \in \{100.000, 25.000, 10.714, 2.778\}$, respectively, for $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$. The
conclusions on the results presented in this table are the same as in Figure 3. Although the
credibility intervals for $d$ are slightly wider in Table 5 than in Figure 3, in both cases $d_0 \in
CI_{0.95}(d)$ for any combination of $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ and $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$.

As in Case 2, when considering a two-step estimator (scenario C5.2), no improvement is
observed when compared to Case 1. In fact, once again, the estimates obtained by letting
$a_3 = 25$ and $b_3 = a_3(1 - 2\hat{d})(2\hat{d})^{-1}$, where $\hat{d}$ is the estimate of $d$ obtained in Case 1, are very
close to $\hat{d}$ itself.



5 Conclusions 

The Bayesian inference approach for parameter estimation in FIEGARCH models was described
and a Monte Carlo simulation study was conducted to analyze the performance of the method
under the presence of long memory in volatility. The samples from FIEGARCH processes were
obtained by considering the infinite sum representation for the logarithm of the volatility. A
recurrence formula was used to obtain the coefficients for this representation. The generalized
error distribution with different tail-thickness parameters was considered, so that innovation
processes with both lighter and heavier tails than the Gaussian distribution were covered.

Markov Chain Monte Carlo (MCMC) methods were used to obtain samples from the
posterior distribution of the parameters. A sensitivity analysis was performed by considering the
following steps. First, an improper prior for $\nu$ and uniform priors for $d$, $\theta$, $\gamma$ and $\omega$ were selected. In
this case, only the basic set of information usually available in practice was considered. Second,
non-uniform priors were selected for one or more parameters in $\{d, \theta, \gamma\}$. A Gaussian prior for
$\phi^{-1}(d)$, with $\phi(\cdot)$ defined in (14), combined with uniform or Beta priors for $\theta$ ($-\theta$ in the Beta
case) and $\gamma$ was considered. Subsequently, a comparison was made by assuming Beta priors for
$2d$, $-\theta$ and $\gamma$. The sensitivity analysis was completed by integrating (or not) the knowledge of
the true parameter values to select the hyperparameter values.

Table 5: Summary for the sample obtained from posterior distributions considering Beta
priors for $2d$, $-\theta$ and $\gamma$: mean $\bar{\eta}_i$, standard deviation $\mathrm{sd}_{\eta_i}$ and the 95% credibility interval
$CI_{0.95}(\eta_i)$ for the parameter $\eta_i$ in $\eta = (\nu, d, \theta, \gamma, \omega)' := (\eta_1, \dots, \eta_5)'$, for each $i \in \{1, \dots, 5\}$.
The true parameter values considered in this simulation are $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$, $\nu_0 \in
\{1.1, 1.5, 2.5, 5.0\}$, $\theta_0 = -0.15$, $\gamma_0 = 0.24$ and $\omega_0 = -5.4$.

d0    ν0    ν (sd_ν)        d (sd_d)        θ (sd_θ)          γ (sd_γ)        ω (sd_ω)
            CI_0.95(ν)      CI_0.95(d)      CI_0.95(θ)        CI_0.95(γ)      CI_0.95(ω)

0.10  1.1   1.102 (0.045)   0.100 (0.018)   -0.144 (0.011)    0.243 (0.026)   -5.472 (0.063)
            [1.012; 1.198]  [0.069; 0.139]  [-0.170; -0.124]  [0.199; 0.294]  [-5.569; -5.347]
      1.5   1.483 (0.063)   0.102 (0.019)   -0.154 (0.013)    0.239 (0.024)   -5.412 (0.047)
            [1.364; 1.607]  [0.069; 0.141]  [-0.178; -0.132]  [0.190; 0.285]  [-5.524; -5.333]
      1.9   2.067 (0.108)   0.101 (0.016)   -0.160 (0.012)    0.234 (0.027)   -5.422 (0.034)
            [1.879; 2.309]  [0.070; 0.134]  [-0.186; -0.137]  [0.184; 0.291]  [-5.498; -5.360]
      2.5   2.702 (0.143)   0.106 (0.019)   -0.160 (0.012)    0.231 (0.023)   -5.407 (0.022)
            [2.442; 2.977]  [0.074; 0.150]  [-0.186; -0.136]  [0.187; 0.277]  [-5.470; -5.375]
      5.0   5.251 (0.391)   0.099 (0.017)   -0.158 (0.012)    0.263 (0.024)   -5.343 (0.036)
            [4.514; 6.080]  [0.070; 0.132]  [-0.186; -0.134]  [0.220; 0.312]  [-5.390; -5.265]

0.25  1.1   1.095 (0.044)   0.255 (0.031)   -0.145 (0.012)    0.242 (0.028)   -5.455 (0.057)
            [0.986; 1.190]  [0.197; 0.315]  [-0.173; -0.123]  [0.191; 0.295]  [-5.559; -5.348]
      1.5   1.485 (0.066)   0.252 (0.030)   -0.154 (0.013)    0.240 (0.022)   -5.421 (0.046)
            [1.357; 1.628]  [0.193; 0.313]  [-0.180; -0.132]  [0.195; 0.281]  [-5.529; -5.324]
      1.9   2.058 (0.113)   0.260 (0.030)   -0.159 (0.012)    0.236 (0.028)   -5.431 (0.048)
            [1.864; 2.316]  [0.194; 0.309]  [-0.185; -0.138]  [0.189; 0.293]  [-5.525; -5.340]
      2.5   2.719 (0.146)   0.271 (0.029)   -0.163 (0.012)    0.229 (0.021)   -5.386 (0.038)
            [2.452; 3.008]  [0.216; 0.326]  [-0.188; -0.137]  [0.186; 0.275]  [-5.469; -5.304]
      5.0   5.208 (0.326)   0.244 (0.028)   -0.159 (0.013)    0.260 (0.023)   -5.313 (0.034)
            [4.548; 5.871]  [0.190; 0.299]  [-0.185; -0.134]  [0.213; 0.309]  [-5.381; -5.256]

0.35  1.1   1.097 (0.038)   0.355 (0.034)   -0.145 (0.012)    0.241 (0.027)   -5.469 (0.080)
            [1.014; 1.175]  [0.283; 0.413]  [-0.169; -0.126]  [0.186; 0.298]  [-5.630; -5.289]
      1.5   1.481 (0.064)   0.349 (0.030)   -0.154 (0.012)    0.239 (0.024)   -5.447 (0.077)
            [1.359; 1.628]  [0.285; 0.403]  [-0.178; -0.131]  [0.192; 0.285]  [-5.587; -5.310]
      1.9   2.070 (0.102)   0.370 (0.029)   -0.160 (0.012)    0.236 (0.027)   -5.414 (0.082)
            [1.870; 2.288]  [0.306; 0.421]  [-0.183; -0.136]  [0.186; 0.293]  [-5.587; -5.237]
      2.5   2.720 (0.147)   0.375 (0.028)   -0.163 (0.011)    0.228 (0.023)   -5.337 (0.063)
            [2.449; 3.009]  [0.321; 0.426]  [-0.185; -0.143]  [0.186; 0.274]  [-5.440; -5.221]
      5.0   5.147 (0.346)   0.344 (0.027)   -0.159 (0.012)    0.258 (0.023)   -5.255 (0.048)
            [4.598; 5.965]  [0.287; 0.395]  [-0.185; -0.137]  [0.214; 0.305]  [-5.321; -5.171]

0.45  1.1   1.101 (0.042)   0.454 (0.024)   -0.145 (0.012)    0.238 (0.026)   -5.424 (0.128)
            [1.024; 1.191]  [0.402; 0.489]  [-0.169; -0.122]  [0.189; 0.286]  [-5.682; -5.160]
      1.5   1.493 (0.073)   0.450 (0.024)   -0.154 (0.012)    0.243 (0.024)   -5.414 (0.132)
            [1.362; 1.645]  [0.395; 0.488]  [-0.177; -0.132]  [0.195; 0.291]  [-5.681; -5.139]
      1.9   2.045 (0.109)   0.464 (0.019)   -0.158 (0.011)    0.239 (0.026)   -5.424 (0.126)
            [1.844; 2.241]  [0.419; 0.491]  [-0.181; -0.134]  [0.193; 0.291]  [-5.717; -5.216]
      2.5   2.742 (0.142)   0.466 (0.017)   -0.164 (0.012)    0.228 (0.022)   -5.170 (0.084)
            [2.507; 3.010]  [0.431; 0.493]  [-0.191; -0.141]  [0.189; 0.275]  [-5.308; -5.002]
      5.0   5.164 (0.346)   0.447 (0.022)   -0.158 (0.011)    0.256 (0.022)   -5.070 (0.076)
            [4.558; 5.883]  [0.396; 0.486]  [-0.182; -0.137]  [0.214; 0.300]  [-5.227; -4.942]

Note: The bold-face font for the credibility interval indicates that the interval does not contain the true
parameter value.

An example was presented to illustrate the similarities or differences in the mean, standard
deviation and credibility intervals estimated by considering a chain of size $N = 200801$, a thinned
chain (thinning parameter 200 and burn-in size 1000) and a sample of size 1000 (obtained from
the larger chain, after the burn-in of size 1000). Given the ergodicity of the Markov chain, the
posterior means for all three chains were very close. The differences in the standard deviations
and credibility intervals are not significant enough to justify the use of the entire or thinned
chains. Although the example only presents the case $d_0 = 0.25$, the same conclusions apply to
$d_0 \in \{0.10, 0.35, 0.45\}$.
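The sizes quoted for the thinning example are mutually consistent, as the short sketch below illustrates. It assumes one natural reading of the setup, namely that the first 1000 draws are discarded and every 200th draw is kept from there on; the list `chain` is only a stand-in for the stored draws.

```python
chain_length = 200801   # size N of the full chain
burn_in = 1000          # number of initial draws discarded
thinning = 200          # keep one draw out of every 200

# Stand-in for the stored draws: the iteration index plays the role of the draw.
chain = list(range(chain_length))

# Discard the burn-in, then keep every 200th draw.
thinned = chain[burn_in::thinning]
print(len(thinned))               # 1000
print(thinned[0], thinned[-1])    # 1000 200800
```

Under this reading the thinned chain has exactly 1000 draws and its last element is the final draw of the full chain.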

The simulation study showed that if the prior of one or more parameters is changed, the
estimation of the other parameters is not significantly affected. The parameters $\nu$ and $\omega$ are
always well estimated, in terms of absolute percentage error, regardless of the priors considered for
$d$, $\theta$ and $\gamma$, for any combination of $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ and $\nu_0 \in \{1.1, 1.5, 2.5, 5.0\}$. With
a few exceptions, the true parameter value $\nu_0$ was contained in the 95% credibility interval, for
any combination of $\nu_0 \in \{1.1, 1.5, 1.9, 2.5, 5.0\}$ and $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$ considered. The
true parameter value $\omega_0$ was not contained in any credibility interval when $\nu_0 = 5$.

Regardless of the prior considered, the parameter $d$ is usually better estimated when $d_0 \in \{0.35,
0.45\}$. The Gaussian prior for $\phi^{-1}(d)$ only provided better results (globally) when the knowledge
of the true parameter value $d_0$ was used to set $\mu_\phi = \phi^{-1}(d_0)$ and $\sigma_\phi$ was set to some value
smaller than or equal to 1. In particular, only when $\sigma_\phi = 0.15$ did the absolute percentage error of
estimation (ape) become smaller than 10% for all $d_0 \in \{0.10, 0.25, 0.35, 0.45\}$. Although the
credibility intervals for $d$ are slightly wider when a Beta prior is considered, the use of the
Beta prior for $2d$ neither improves nor degrades the estimation performance, compared to the
Gaussian prior for $\phi^{-1}(d)$.

The absolute percentage error of estimation for $\theta$ ($\mathrm{ape}_\theta$) only became smaller than 10%
when the Beta prior was considered and the true value of the parameter was used to select the
hyperparameter. When $\theta_0$ was assumed unknown, $\mathrm{ape}_\theta$ was always between 10% and 38.1%.
The parameter $\gamma$ is always better estimated than $\theta$, for any priors considered. Similar to $d$ and
$\theta$, the best performance is obtained when the true parameter value is used to select the
hyperparameters. On the other hand, $\gamma$ is the only parameter for which there are hyperparameter
values that do not yield $\mu_B = \gamma_0$ ($\mu_B$ is the mean of the prior distribution and $\gamma_0$ is the true
parameter value) while still providing good estimates.